CN115205264A - High-resolution remote sensing ship detection method based on improved YOLOv4 - Google Patents

High-resolution remote sensing ship detection method based on improved YOLOv4 Download PDF

Info

Publication number
CN115205264A
CN115205264A (application CN202210860051.6A)
Authority
CN
China
Prior art keywords
module
improved
yolov4
weight
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210860051.6A
Other languages
Chinese (zh)
Inventor
许鑫
陈巍
陈伟
贺晨煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Institute of Technology
Original Assignee
Nanjing Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Institute of Technology filed Critical Nanjing Institute of Technology
Priority to CN202210860051.6A
Publication of CN115205264A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/0002 - Inspection of images, e.g. flaw detection
    • G06T 7/0004 - Industrial image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/10 - Terrestrial scenes
    • G06V 20/17 - Terrestrial scenes taken from planes or by drones
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20092 - Interactive image processing based on input by user
    • G06T 2207/20104 - Interactive definition of region of interest [ROI]


Abstract

A high-resolution remote sensing ship detection method based on improved YOLOv4 comprises the steps of collecting an original data set and establishing a target data set; adding an atrous spatial pyramid pooling (ASPP) module and a CPA module to an improved YOLOv4 network; on the basis of the existing YOLOv4 method, the SPP module is replaced by the ASPP module, and detection of small targets is improved by adding multi-scale feature fusion; a CPA module is added to the improved feature fusion network to improve the effectiveness of feature extraction; in the loss calculation stage, the XIoU loss function replaces the CIoU loss function, addressing the poor positioning precision for ships with a large aspect ratio and the long training time in network training. The detection precision of the method is tested and verified on a large number of images collected by actual aerial photography. Experiments show that the method achieves automatic real-time detection of sea surface ships, improves detection precision and efficiency, and obtains a better detection effect than traditional target detection techniques.

Description

High-resolution remote sensing ship detection method based on improved YOLOv4
Technical Field
The invention relates to the technical field of ship detection, in particular to a high-resolution remote sensing ship detection method based on improved YOLOv4.
Background
Optical remote sensing images are characterized by high spatial resolution, rich image content and small geometric deformation, and are of great value in many research fields (such as resource survey, military reconnaissance and ocean research). Ship target detection based on high-resolution remote sensing images is a hotspot and a difficult problem in the field of machine vision; its main task is to determine the exact coordinate position of the ship target in the optical remote sensing image. It is of great significance in civil, commercial and military applications: it can make an outstanding contribution to sea supervision in coastal areas, and it also bears on the national economy and territorial security. Therefore, research on how to detect ship targets rapidly and accurately is of great importance.
With the continuous development of remote sensing, geographic information systems and related technologies in recent years, traditional ship detection performed manually can no longer meet task requirements. The ship detection methods reported or put into use in the prior art can be roughly divided into two types: traditional image processing methods and target detection algorithms based on deep learning. The former mostly rely on filtering, feature extraction, threshold segmentation, edge detection and the like. However, these methods are only suitable for scenes with simple weather conditions and a calm sea surface; in complex scenes they are susceptible to illumination intensity and sea surface glints, which reduces detection accuracy. Their application scenarios are therefore limited and they lack universality. The latter are represented by the convolutional neural network, which can learn richer semantic information and high-level image feature representations from massive training data, offering stronger robustness and more efficient detection. Many researchers have achieved remarkable performance by applying deep learning techniques to marine vessel target detection. At present, the network models adopted for ship detection tasks fall mainly into two categories: one is the region-based two-stage algorithm, which first generates target candidate boxes and then classifies and regresses them; the other is the one-stage algorithm, which uses a regression idea to directly predict the categories and positions of different targets.
However, traditional networks have shallow layers, cannot sufficiently extract the texture features of ship targets in high-resolution remote sensing images, and perform poorly on small ship targets. Moreover, because ships have a large aspect ratio, the position loss function in traditional deep learning cannot obtain accurate positioning information, resulting in positioning deviation.
Therefore, it is urgently needed to provide a method for improving the accuracy and robustness of the ship target detection of the high-resolution remote sensing image.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a high-resolution remote sensing ship detection method based on improved YOLOv4; the method can quickly and accurately detect and identify ship targets in high-resolution remote sensing images.
In order to achieve the purpose, the invention adopts the following technical scheme:
a high-resolution remote sensing ship detection method based on improved YOLOv4 comprises the following steps:
s1: acquiring an original data set of a sea surface ship unmanned aerial vehicle aerial image, and establishing a target data set comprising a training set and a verification set;
s2: constructing an improved YOLOv4 network;
s3: designing an optimization loss function;
s4: training and verifying the improved YOLOv4 network in the step S2 by combining the optimization loss function designed in the step S3 and the training set and the verification set in the step S1; and then the ship target is detected and identified by the remote sensing image.
In order to optimize the technical scheme, the specific measures adopted further comprise:
further, the specific content of step S1 is:
s1.1: acquiring ship information on the sea surface by a sea surface ship unmanned aerial vehicle to obtain an original data set;
s1.2: marking a ship target in the original data set to obtain a target picture;
s1.3: preprocessing a target picture, comprising: cutting and zooming the target picture, wherein the zooming method specifically comprises the following steps: setting the maximum side length size as x, and the width and height of the current target picture as w and h respectively, then the width wnew = w min (x/w, x/h) of the zoomed target picture, and the height hnew = h min (x/w, x/h) of the zoomed target picture;
s1.4: establishing a target data set according to the cut and zoomed target picture, and dividing the target data set into a training set and a verification set; wherein the number ratio of the training set to the validation set is 4.
Further, the specific content of step S2 is:
s2.1: designing CPA modules
The CPA module comprises a channel attention module and a pixel attention module; and the calculation formula of the CPA module is as follows:
P′ = M_c(P) ⊗ P
P″ = M_s(P′) ⊗ P′
in the formulas, M_c(P) is the feature weight output by the channel attention module, P is the input feature map, P′ is the feature map output by the channel attention module, M_s(P′) is the feature weight output by the pixel attention module, P″ is the feature map output by the CPA module, and ⊗ denotes element-wise multiplication;
the design of the channel attention module is as follows: for an input feature map P of size H × W × C, global average pooling is first applied to obtain a 1 × 1 × C feature map carrying global semantic information; two 1 × 1 convolutions then freely reduce and restore the channel dimension to extract depth channel feature information and yield the channel feature weights; finally, the weights are multiplied with the feature values to obtain a new weighted feature map P′;
the design of the pixel attention module is as follows: three 1 × 1 convolution layers are applied to the weighted feature map P′ produced by the channel attention module to generate the matrices Query, Key and Value; a dot-product (Dot) operation and a Softmax function are then applied to Query and Key to obtain a weight of size (H × W) × (H × W), and the dot product of this weight with Value yields the pixel-attended weighted feature map P″;
s2.2: designing ASPP modules
The ASPP module comprises three 3 × 3 dilated (atrous) convolution layers with dilation rates of 6, 12 and 18 respectively, a 1 × 1 convolution layer and a global average pooling layer; these generate 5 multi-scale features, which are concatenated and fused along the channel dimension, and a final 1 × 1 convolution layer adjusts the number of channels;
s2.3: obtaining an improved YOLOv4 network structure
Based on the existing YOLOv4 network structure, the CPA module designed in step S2.1 is added before the down-sampling stages; the CPA module redistributes the channel feature weights and pixel feature weights of the feature map, improving detection precision by increasing the weights of feature regions of interest and reducing background interference by decreasing the weights of irrelevant feature regions; meanwhile, on the basis of the existing YOLOv4 network structure, the SPP module in the network is replaced by the ASPP module designed in step S2.2; the improved YOLOv4 network structure is thereby obtained.
Further, the specific content of step S3 is: replacing the CIoU loss function of the existing YOLOv4 network structure with the optimized XIoU loss function; the optimized XIoU loss function L_XIoU is specifically:
L_XIoU = 1 - IoU + dIoU
dIoU = (AA′ + BB′ + CC′ + DD′) / (4c)
in the formula, IoU is expressed as IoU = |BBox_t ∩ BBox_p| / |BBox_t ∪ BBox_p|, where BBox_t is the real (ground-truth) box of the target picture, BBox_p is the prediction box of the target picture, and IoU is the ratio of the intersection to the union of the real box and the prediction box; AA′, BB′, CC′ and DD′ denote the distances between the upper-left, upper-right, lower-left and lower-right corner points of the real box and the corresponding corner points of the prediction box, and c denotes the diagonal length of the smallest rectangle enclosing the real box and the prediction box; the dIoU term means that, with 4 corner points, boxes of the same shape and aspect ratio can be fitted during training.
The invention has the beneficial effects that:
1. When sea surface ship detection is carried out, backgrounds such as the ocean and ports occupy a large proportion of the whole image, and ships have a large aspect ratio. To suppress background interference and effectively acquire foreground information, a CPA module is embedded in the Neck to improve the visual representation. The CPA module designed in this application redistributes the channel feature weights and pixel feature weights of the feature map: it improves detection precision by increasing the weights of feature regions of interest, reduces background interference by decreasing the weights of irrelevant feature regions, selectively extracts features according to the aspect ratio of the object, and finely allocates and processes information.
2. Because ship targets are small in the data set images, Atrous Spatial Pyramid Pooling (ASPP) is introduced to replace the original SPP module in order to improve small-target detection and strengthen the semantic relation between local and global information. Compared with SPP, ASPP fuses more scale features and effectively improves the detection of small targets.
3. The method optimizes the loss function of YOLOv4, replacing the CIoU loss function with the XIoU loss function, which effectively reduces computation and greatly improves the positioning precision for ship targets with a large aspect ratio.
Drawings
Fig. 1 is a schematic diagram of the existing YOLOv4 network structure of the present invention.
Fig. 2 is a schematic structural diagram of a CPA module designed by the present invention.
FIG. 3 is a schematic diagram of a channel attention module structure designed by the present invention.
FIG. 4 is a schematic diagram of a pixel attention module according to the present invention.
Fig. 5 is a schematic structural diagram of an ASPP module designed by the present invention.
Fig. 6 is a schematic diagram of the improved YOLOv4 network structure of the present invention.
FIG. 7 is a schematic diagram of the CDIoU and CIoU loss functions of the present invention.
Fig. 8 is a schematic diagram comparing the detection effect of the traditional YOLOv4 network and the improved YOLOv4 network of the present application.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings.
The YOLO series is one of the typical one-stage algorithm families and is currently the most widely applied. Redmon et al. proposed the YOLOv1 algorithm in 2016, combining candidate-box generation and classification regression into a single network, which greatly reduced computational complexity but localized targets poorly. After several iterations of improvement that made up for these shortcomings, the current YOLOv4 achieves high-precision detection while keeping the speed advantage. The YOLOv4 network structure consists mainly of 4 parts: the input terminal, the Backbone network, the Neck and the Head. The specific network structure is shown in Fig. 1.
As shown in Fig. 1, YOLOv4 performs feature extraction on the input data using a new backbone network, CSPDarknet53. CSPDarknet53 is a modified version of Darknet53 that integrates a Cross Stage Partial Network (CSPNet). The CSP module reduces computation while maintaining precision, achieving a good balance of speed and accuracy. In the target detection stage, a Neck structure is usually inserted after the Backbone for better feature extraction. In YOLOv4, the Neck part includes both the SPP and FPN + PAN structures. The SPP module removes the restriction on the input size of the convolutional neural network and performs multi-scale feature fusion. The FPN + PAN structure fuses the FPN (feature pyramid network) and PAN structures: the top-down FPN extracts rich semantic features, while the bottom-up PAN makes full use of shallow features and extracts rich location features. The Neck further improves feature extraction capability, performs feature fusion and improves detection precision. Finally, YOLOv4 outputs three feature maps of different scales to detect targets of different sizes, performs non-maximum suppression (NMS), and adjusts the prior boxes to obtain the final result.
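The NMS post-processing step mentioned above can be illustrated with a minimal greedy sketch in Python (the function names and the 0.45 IoU threshold are illustrative defaults, not values from the patent):

```python
def iou(a, b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thr=0.45):
    # Greedy non-maximum suppression: keep the highest-scoring box,
    # drop every remaining box overlapping it above the threshold, repeat.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [j for j in order if iou(boxes[best], boxes[j]) < iou_thr]
    return keep
```

In practice this runs per class on the three output scales; frameworks such as torchvision provide an equivalent built-in.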
The application improves the YOLOv4 algorithm
1. Network architecture optimization
1.1 CPA Module
When sea surface ship detection is carried out, backgrounds such as the ocean and ports occupy a large proportion of the whole image, and ships have a large aspect ratio. To suppress background interference and effectively acquire foreground information, a CPA module is embedded in the Neck to improve the visual representation. As shown in Fig. 2, the CPA module is a lightweight and efficient attention module composed of two sub-modules, a Channel-wise Attention module and a Pixel-wise Attention module; its mathematical expression is shown in formula 1, where ⊗ represents element-wise multiplication, P is the input feature map, M_c(P) is the feature weight output by the channel attention module, P′ is the feature map output by the channel attention module, M_s(P′) is the feature weight output by the pixel attention module, and P″ is the feature map output by the CPA module.
P′ = M_c(P) ⊗ P, P″ = M_s(P′) ⊗ P′ (1)
The input feature map first passes through the channel attention module, whose specific structure is shown in Fig. 3. Assuming the input feature map P has size H × W × C, Global Average Pooling (GAP) is first applied to P to obtain a 1 × 1 × C feature map carrying global semantic information, and two 1 × 1 convolution layers realize free reduction and restoration of the channel dimension to obtain depth channel feature information (reduction ratio k = 8), yielding the channel feature weights. Finally, a new weighted feature map P′ is obtained by multiplying the weights with the feature values.
The specific structure of the pixel attention module is shown in Fig. 4; the invention applies dot-product attention to P′. Three 1 × 1 convolution layers generate the matrices Query, Key and Value (abbreviated Q, K and V), with the number of channels of Q and K reduced to 1/4 of the original (reduction ratio k = 4). A dot product (Dot) operation and a Softmax function are then applied to Q and K, producing a weight of size (H × W) × (H × W). The pixel-attended weighted feature map P″ is obtained as the dot product of this weight with V; the specific process is shown in formula 2.
P_attention(Q, K, V) = softmax(Q · K^T) · V (2)
The CPA module redistributes the channel feature weights and pixel feature weights of the feature map: it improves detection precision by increasing the weights of feature regions of interest, reduces background interference by decreasing the weights of irrelevant feature regions, selectively extracts features according to the aspect ratio of the object, and finely allocates and processes information.
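A rough PyTorch sketch of the CPA module described above. The reduction ratios follow the text (k = 8 for the channel branch, k = 4 for Q and K); the class names, the sigmoid gating in the SENet-style channel branch, and the absence of a residual connection are assumptions not fixed by the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    # GAP -> two 1x1 convs (reduce/restore by k=8) -> per-channel weights M_c(P).
    def __init__(self, c, k=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Conv2d(c, c // k, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c // k, c, 1), nn.Sigmoid())

    def forward(self, p):
        w = self.fc(F.adaptive_avg_pool2d(p, 1))  # 1 x 1 x C weights
        return p * w                              # P' = M_c(P) (x) P

class PixelAttention(nn.Module):
    # Dot-product attention over the H*W positions; Q and K use C/4 channels.
    def __init__(self, c, k=4):
        super().__init__()
        self.q = nn.Conv2d(c, c // k, 1)
        self.k = nn.Conv2d(c, c // k, 1)
        self.v = nn.Conv2d(c, c, 1)

    def forward(self, p):
        b, c, h, w = p.shape
        q = self.q(p).flatten(2).transpose(1, 2)   # B x HW x C/k
        k = self.k(p).flatten(2)                   # B x C/k x HW
        v = self.v(p).flatten(2).transpose(1, 2)   # B x HW x C
        attn = torch.softmax(q @ k, dim=-1)        # (HW x HW) weight, formula 2
        out = (attn @ v).transpose(1, 2)           # B x C x HW
        return out.reshape(b, c, h, w)             # P''

class CPA(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.ca = ChannelAttention(c)
        self.pa = PixelAttention(c)

    def forward(self, p):
        return self.pa(self.ca(p))
```

The output shape matches the input, so the block can be dropped in front of a down-sampling stage of the Neck without other changes.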
1.2 Atrous Spatial Pyramid Pooling (ASPP) Module
Because ship targets are small in the data set images, Atrous Spatial Pyramid Pooling (ASPP) is introduced to replace the original SPP module in order to improve small-target detection and strengthen the semantic relation between local and global information; it aggregates multi-scale context information, and its specific structure is shown in Fig. 5. In YOLOv4, SPPNet mainly uses pooling layers of 3 different scales for 3-scale feature extraction and feature fusion. ASPP instead generates 5 multi-scale features using three 3 × 3 dilated convolution layers with dilation rates of 6, 12 and 18, a 1 × 1 convolution layer and a global average pooling layer; it then concatenates and fuses the 5 multi-scale features along the channel dimension and finally adjusts the number of channels with a 1 × 1 convolution layer. Compared with SPP, ASPP fuses more scale features and effectively improves the detection of small targets.
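The five-branch ASPP structure described above can be sketched in PyTorch roughly as follows (normalization/activation layers and the upsampling mode for the pooled branch are assumptions, as the text does not specify them):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    # 1x1 conv + three 3x3 dilated convs (rates 6, 12, 18) + global average
    # pooling, concatenated on the channel dim and fused by a final 1x1 conv.
    def __init__(self, c_in, c_out):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(c_in, c_out, 1)] +
            [nn.Conv2d(c_in, c_out, 3, padding=r, dilation=r)
             for r in (6, 12, 18)])
        self.gap = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                 nn.Conv2d(c_in, c_out, 1))
        self.fuse = nn.Conv2d(5 * c_out, c_out, 1)  # adjust channel count

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [b(x) for b in self.branches]
        # Broadcast the pooled 1x1 feature back to the spatial size.
        feats.append(F.interpolate(self.gap(x), size=(h, w), mode='nearest'))
        return self.fuse(torch.cat(feats, dim=1))
```

With padding equal to the dilation rate, every branch preserves the spatial size, so the module is a drop-in replacement for SPP at the same position in the network.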
2. Improved network structure
The invention adds two CPA modules and replaces SPP module with ASPP module, the improved network structure diagram is shown in figure 6.
3. Loss function optimization
To improve detection accuracy, the loss function is improved. The loss function of YOLOv4 consists of 3 parts: the position regression loss function (loss_loc), the object confidence loss function (loss_obj) and the object classification loss function (loss_cls), as shown in formula 3.
Loss = loss_loc + loss_obj + loss_cls (3)
Among the position regression loss functions, YOLOv4 employs the CIoU loss function, which is specifically shown in the following formulas 4, 5, 6, and 7.
L_CIoU = 1 - IoU + d²/c² + αv (4)
IoU = |S_P ∩ S_T| / |S_P ∪ S_T| (5)
v = (4/π²) · (arctan(w_T/h_T) - arctan(w_P/h_P))² (6)
α = v / ((1 - IoU) + v) (7)
where S_P is the area of the prediction box, S_T is the area of the real box, d and c denote the distance between the center points of the prediction box and the real box and the diagonal length of the smallest enclosing rectangle respectively, w_T and h_T are the width and height of the real box, and w_P and h_P are the width and height of the prediction box.
Because the CIoU loss function involves an inverse trigonometric function, its computation is heavy and it prolongs network training. Since the detected ships are distinctive in their aspect ratio, the aspect ratio of the detected object must be taken into account; the XIoU loss function is therefore used instead of the CIoU loss function, as shown in the following formulas.
L_XIoU = 1 - IoU + dIoU (8)
dIoU = (AA′ + BB′ + CC′ + DD′) / (4c) (9)
Wherein AA′, BB′, CC′ and DD′ are the distances between the corresponding upper-left, upper-right, lower-left and lower-right corner points of the real box and the prediction box. Although the XIoU loss function does not explicitly model the aspect ratio and center-point distance as the CIoU loss function does, during training the model gradually pulls the 4 corner points of the prediction box towards the 4 corner points of the real box until they overlap.
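A plain-Python sketch of the XIoU loss as reconstructed from formulas 8 and 9 (the normalization of the four corner distances by 4c, with c the diagonal of the smallest enclosing box, is an assumption based on the symbols defined above):

```python
import math

def box_iou(bt, bp):
    # IoU of ground-truth and prediction boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(bt[0], bp[0]), max(bt[1], bp[1])
    ix2, iy2 = min(bt[2], bp[2]), min(bt[3], bp[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(bt) + area(bp) - inter)

def xiou_loss(bt, bp):
    # L_XIoU = 1 - IoU + dIoU, where dIoU averages the 4 corner-point
    # distances (AA', BB', CC', DD') normalized by the diagonal c of the
    # smallest box enclosing both bt and bp.
    corners = lambda b: [(b[0], b[1]), (b[2], b[1]), (b[0], b[3]), (b[2], b[3])]
    dists = [math.dist(a, b) for a, b in zip(corners(bt), corners(bp))]
    c = math.dist((min(bt[0], bp[0]), min(bt[1], bp[1])),
                  (max(bt[2], bp[2]), max(bt[3], bp[3])))
    return 1 - box_iou(bt, bp) + sum(dists) / (4 * c)
```

For identical boxes the loss is 0; no arctan appears, which is the source of the claimed computation saving relative to CIoU.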
The following description will be made by way of specific examples
The invention relates to a high-resolution remote sensing image ship detection method based on improved YOLOv4, which specifically comprises the following steps:
step 1, collecting an original data set of an unmanned aerial image of a sea surface ship, and establishing a target data set.
The original data set of the invention comes from actual aerial images of sea surface ships taken by an unmanned aerial vehicle, 1,800 images in total, of which 1,440 form the training set and 360 form the verification set.
The ship targets in the original data set are marked to obtain target pictures; during marking, the bounding box should cover the target area as completely as possible while minimizing other background information inside the box.
Considering the I/O bottleneck in training, the target pictures are preprocessed: they are cropped and scaled to reduce the input picture size during training.
For scaling, assuming the set maximum side length is l and the width and height of the original picture are w and h, the width wnew and height hnew of the scaled target picture are:
wnew=w*min(l/w,l/h)
hnew=h*min(l/w,l/h)
and establishing a target data set according to the cut and zoomed target picture, wherein the target data set is divided into a training set and a verification set.
Step 2, constructing an improved YOLOv4 network, comprising the following steps;
step 2.1, designing a CPA module, wherein the CPA module structure is shown in figure 2;
the CPA module consists of two sub-modules, a Channel-wise Attention module and a Pixel-wise Attention module. The channel attention module employs a SENET module, which contains one global averaging pooling layer and two 1 x 1 convolutional layers. The pixel attention module adopts a Dot product attention module, 3 convolution layers of 1 × 1 are respectively used for the feature map to generate matrixes Query, key and Value (simplified to Q, K and V), then Dot product (Dot) operation and Softmax function calculation are carried out on Q and K, and the obtained result and V are subjected to Dot product operation.
Step 2.2, adding an ASPP module, wherein the structure of the ASPP module is shown in figure 5;
the ASPP module generates 5 multi-scale features by introducing 3 sampling rates of 3 × 3 hole convolution layers, 1 × 1 convolution layer and a global average pooling layer which are respectively (6, 12 and 18), then splicing and fusing the 5 multi-scale features on channel dimensions, and finally adjusting the number of channels by using the 1 × 1 convolution layer, so that the receptive field can be effectively expanded, and the multi-scale structure can effectively improve the new detection energy of small targets.
Step 2.3: acquiring the improved YOLOv4 network structure.
Based on the improvements of the previous two steps, the invention proposes an improved YOLOv4 network model, as shown in Fig. 6.
In the feature fusion network, the CPA module is added before the down-sampling stages; it redistributes the channel feature weights and pixel feature weights of the feature map, improving detection precision by increasing the weights of feature regions of interest and reducing background interference by decreasing the weights of irrelevant feature regions, so that features are selectively extracted according to the aspect ratio of the object and information is finely allocated and processed. The SPP module is replaced by the ASPP module, which effectively enlarges the receptive field and aggregates multi-scale context information to improve the detection of small ship targets.
Step 3, optimizing a loss function;
the method optimizes the loss function of YOLOv4, adopts the optimized XIoU loss function to replace the CIoU loss function, can effectively reduce the calculated amount, and greatly improves the positioning accuracy of the ship target with large length-width ratio.
Step 4: to verify the effectiveness and practicality of the target detection method of the technical scheme of the invention, a test set is used to evaluate the target detection performance of the trained ship detection network model, and the results are compared with other technologies currently applied to ship detection.
The specific test contents are as follows:
the experimental environment is as follows: the system of Ubuntu18.04, the video card NVIDIA GeForce RTX2080Ti (the video memory is 11 GB), and the deep learning framework Pythrch;
the improved YOLOv4 model is trained on 300 iterations on an experimental data set. The XIoU loss function is shown in FIG. 7. As can be seen from fig. 7, the loss value gradually decays with increasing number of iterations and gradually stabilizes, eventually dropping to 0.9289%. Compared with the original CIoU loss function, the improvement of 1.46 percent is obtained.
The evaluation index is the mAP (mean Average Precision), which reflects the overall level of detection precision. The mean average precision (mAP) of the improved YOLOv4 reaches 94.08%, an ideal overall training result.
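For reference, the per-class AP underlying the mAP metric can be computed with all-point interpolation roughly as follows (a generic sketch; the patent does not state which AP interpolation scheme it uses):

```python
def average_precision(recall, precision):
    # Area under the precision-recall curve with all-point interpolation.
    # recall must be sorted ascending; precision[i] corresponds to recall[i].
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [0.0] + list(precision) + [0.0]
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas between consecutive recall points.
    return sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]
               for i in range(len(mrec) - 1))
```

With a single class, as in this ship data set, the mAP equals this per-class AP directly.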
The invention incorporates each improved module into YOLOv4 separately for ablation experiments; as shown in Table 1, the gains contributed by the different improved modules can be clearly seen.
TABLE 1 mAP of improved modules in YOLOv4
The present invention uses five detection networks, Faster RCNN, SSD, YOLOv3, YOLOv4, and the improved YOLOv4 proposed herein, for comparative experiments on the data set; the specific results are shown in Table 2. Analysis shows that, since the improved YOLOv4 introduces 2 CPA modules and 1 ASPP module, its parameter count is the largest among the compared algorithms and its detection speed is lower than those of YOLOv3 and YOLOv4, but its detection accuracy is the highest. On a comprehensive evaluation, the improved YOLOv4 achieves the most ideal detection effect on sea surface ships.
TABLE 2 network results comparison chart
(Table 2 is reproduced as an image in the original publication and is not available in text form.)
In order to verify the effects of the network model before and after optimization, the original YOLOv4 and the improved YOLOv4 network models are used in a controlled experiment on the same data set; the specific detection results are shown in FIG. 8.
As can be seen from the figure, compared with the original YOLOv4, the improved YOLOv4 embeds the CPA module in the Neck and optimizes the loss function, so the confidence scores in the test images are improved to a certain extent. Because the SPP module is replaced by the ASPP module in the improved network model, the detection effect of the improved model on small objects is greatly improved. In the test images, the improved YOLOv4 can detect small targets that the original network model fails to detect.
The YOLOv4 algorithm is a mainstream algorithm in the field of optical image target detection, characterized by both high detection speed and high detection precision. Compared with the original YOLOv4 model, the training process of the improved YOLOv4 model converges better, achieving a 1.46% improvement in the loss function while effectively reducing computational complexity and improving training speed; on the target detection task, it improves mAP@0.5 by 4.84% over the original YOLOv4 model. The method therefore has an excellent detection effect on small-scale ship targets and is suitable for deployment on mobile equipment such as unmanned aerial vehicles.
It should be noted that terms such as "upper", "lower", "left", "right", "front", and "back" used in the present invention are for clarity of description only and are not intended to limit the implementable scope of the invention; changes or adjustments of their relative relationships, without substantive changes to the technical content, shall also be regarded as within the implementable scope of the invention.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments; all technical solutions under the inventive concept belong to the protection scope of the present invention. It should be noted that modifications and adaptations made by those skilled in the art without departing from the principles of the present invention shall also be regarded as within the protection scope of the present invention.

Claims (4)

1. A high-resolution remote sensing ship detection method based on improved YOLOv4 is characterized by comprising the following steps:
s1: acquiring an original data set of an unmanned aerial image of a sea surface ship, and establishing a target data set comprising a training set and a verification set;
s2: constructing an improved YOLOv4 network;
s3: designing an optimized loss function;
s4: training and verifying the improved YOLOv4 network of step S2 by using the optimized loss function designed in step S3 together with the training set and the verification set of step S1; and then detecting and identifying ship targets in remote sensing images.
2. The improved YOLOv 4-based high-resolution remote sensing ship detection method according to claim 1, wherein the specific content of the step S1 is as follows:
s1.1: acquiring sea surface ship information through a sea surface unmanned aerial vehicle to obtain an original data set;
s1.2: marking a ship target in the original data set to obtain a target picture;
s1.3: preprocessing the target pictures, comprising cutting and zooming the target pictures, wherein the zooming method is specifically as follows: let the maximum side length be x, and let the width and height of the current target picture be w and h respectively; then the width of the zoomed target picture is w_new = w · min(x/w, x/h), and the height of the zoomed target picture is h_new = h · min(x/w, x/h);
s1.4: establishing a target data set from the cut and zoomed target pictures, and dividing the target data set into a training set and a verification set, wherein the number ratio of the training set to the verification set is 4:1.
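The scaling rule of step S1.3 can be sketched as a small helper function. The claim fixes only the scale factor min(x/w, x/h); the rounding mode used here is an assumption for illustration:

```python
def letterbox_size(w, h, x):
    """Scale (w, h) so the longer side becomes x while preserving
    the aspect ratio, as in step S1.3:
      w_new = w * min(x/w, x/h),  h_new = h * min(x/w, x/h).
    Rounding to integer pixels is an assumption."""
    scale = min(x / w, x / h)
    return round(w * scale), round(h * scale)
```

For example, a 1000x500 image with x = 608 is scaled by 0.608 to 608x304, so the longer side exactly matches the target size.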
3. The improved YOLOv 4-based high-resolution remote sensing ship detection method according to claim 1, wherein the specific content of the step S2 is as follows:
s2.1: designing CPA modules
The CPA module comprises a channel attention module and a pixel attention module; and the calculation formula of the CPA module is as follows:
P′ = M_c(P) ⊗ P
P″ = M_s(P′) ⊗ P′
in the formulas, M_c(P) is the feature weight output by the channel attention module, P is the input feature map, P′ is the feature map output by the channel attention module, M_s(P′) is the feature weight output by the pixel attention module, ⊗ denotes the application of the attention weight to the feature map, and P″ is the feature map output by the CPA module;
the design of the channel attention module is as follows: for an input feature map P of size H × W × C, global average pooling is first performed to obtain a 1 × 1 × C feature map and thus global semantic information; two 1 × 1 convolutions then reduce and restore the channel dimension to extract deep channel feature information and obtain the channel feature weights; finally, the weights are multiplied with the feature values to obtain a new weighted feature map P′;
the design of the pixel attention module is as follows: three 1 × 1 convolution layers are applied to the weighted feature map P′ produced by the channel attention module to generate the matrices Query, Key and Value; a dot product (Dot) of Query and Key followed by a Softmax function then yields a weight of size (H × W) × (H × W); finally, a dot product (Dot) of this weight with the matrix Value gives the pixel-attended weighted feature map P″;
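A minimal PyTorch sketch of the CPA module described in S2.1. The channel-reduction ratio in the channel branch and the exact channel dimensioning of Query/Key/Value are assumptions; the claim fixes only the 1x1 convolutions, the softmax weighting, and the two-stage composition:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel branch: global average pooling (HxWxC -> 1x1xC),
    two 1x1 convolutions to squeeze and restore the channels,
    sigmoid weights multiplied back onto the input (P' = M_c(P) * P)."""
    def __init__(self, channels, reduction=16):  # reduction ratio is an assumption
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
    def forward(self, p):
        return p * self.fc(self.pool(p))

class PixelAttention(nn.Module):
    """Pixel branch: 1x1 convolutions produce Query/Key/Value;
    softmax(Q @ K) gives an (HW x HW) weight applied to Value."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
    def forward(self, p):
        b, c, h, w = p.shape
        q = self.q(p).flatten(2).transpose(1, 2)   # B x HW x C
        k = self.k(p).flatten(2)                   # B x C x HW
        v = self.v(p).flatten(2).transpose(1, 2)   # B x HW x C
        attn = torch.softmax(q @ k, dim=-1)        # B x HW x HW
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out

class CPA(nn.Module):
    """Channel attention followed by pixel attention: P'' = M_s(P') . P'."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.pa = PixelAttention(channels)
    def forward(self, p):
        return self.pa(self.ca(p))
```

The module preserves the input shape, so it can be placed in front of any downsampling layer of the Neck without altering the surrounding architecture.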
s2.2: designing ASPP modules
The ASPP module comprises three 3 × 3 dilated convolution layers with sampling rates of 6, 12 and 18 respectively, a 1 × 1 convolution layer, and a global average pooling layer; these generate 5 multi-scale features, which are concatenated and fused along the channel dimension, and a final 1 × 1 convolution layer adjusts the number of channels;
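A minimal PyTorch sketch of the ASPP module described in S2.2. The output channel width and the upsampling mode for the pooled branch are assumptions; the five-branch structure and the final 1x1 fusion follow the claim text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: three 3x3 dilated convolutions
    (rates 6, 12, 18), one 1x1 convolution, and a global-average-pooling
    branch; the five feature maps are concatenated along the channel
    dimension and fused by a final 1x1 convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 1),
            nn.Conv2d(in_ch, out_ch, 3, padding=6, dilation=6),
            nn.Conv2d(in_ch, out_ch, 3, padding=12, dilation=12),
            nn.Conv2d(in_ch, out_ch, 3, padding=18, dilation=18),
        ])
        self.gap = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                 nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(5 * out_ch, out_ch, 1)  # fuse and adjust channels
    def forward(self, x):
        h, w = x.shape[2:]
        feats = [b(x) for b in self.branches]
        # broadcast the pooled global context back to the spatial size
        g = F.interpolate(self.gap(x), size=(h, w), mode='nearest')
        return self.project(torch.cat(feats + [g], dim=1))
```

Because padding equals dilation for each 3x3 branch, all five branches keep the input spatial size, which makes the channel-wise concatenation valid.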
s2.3: obtaining an improved YOLOv4 network structure
On the basis of the existing YOLOv4 network structure, the CPA module designed in step S2.1 is added in front of the downsampling operations; the CPA module redistributes the channel feature weights and pixel feature weights of the feature map, improving detection precision by increasing the weight of the feature regions of interest and reducing background interference by lowering the weight of irrelevant feature regions; meanwhile, on the basis of the existing YOLOv4 network structure, the SPP module in the network is replaced by the ASPP module designed in step S2.2, thereby obtaining the improved YOLOv4 network structure.
4. The improved YOLOv4-based high-resolution remote sensing ship detection method according to claim 1, wherein the specific content of step S3 is as follows: replacing the CIoU loss function of the existing YOLOv4 network structure with the optimized XIoU loss function, wherein the optimized XIoU loss function L_XIoU is specifically:
L XIoU =1-IoU+dIoU
dIoU = (AA′² + BB′² + CC′² + DD′²) / (4c²)
in the formulas, IoU is expressed as IoU = |BBox_t ∩ BBox_p| / |BBox_t ∪ BBox_p|, wherein BBox_t is the real frame of the target picture and BBox_p is the prediction frame, so that IoU is the ratio of the intersection to the union of the real frame and the prediction frame; AA′, BB′, CC′ and DD′ denote the distances between the upper left, upper right, lower left and lower right corner points of the real frame of the target picture and the corresponding corner points of the prediction frame, respectively; c denotes the diagonal length of the rectangle enclosing the real frame and the prediction frame of the target picture; by using the 4 corner points, the dIoU term enables frames of the same shape and the same aspect ratio to be fitted during training.
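A sketch of the XIoU loss of claim 4 for axis-aligned boxes given as (x1, y1, x2, y2). The corner-distance term follows the textual definitions (squared distances between the four corresponding corners, normalized by the diagonal of the enclosing rectangle); since the original formula is published only as an image, the exact normalization constant is an assumption:

```python
def xiou_loss(bt, bp):
    """Hypothetical sketch of L_XIoU = 1 - IoU + dIoU, where dIoU sums
    the squared distances between the four corresponding corners of the
    true box bt and predicted box bp, normalized by the squared diagonal
    of their enclosing rectangle (normalization is an assumption)."""
    # intersection over union
    ix1, iy1 = max(bt[0], bp[0]), max(bt[1], bp[1])
    ix2, iy2 = min(bt[2], bp[2]), min(bt[3], bp[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_t = (bt[2] - bt[0]) * (bt[3] - bt[1])
    area_p = (bp[2] - bp[0]) * (bp[3] - bp[1])
    iou = inter / (area_t + area_p - inter)
    # corner pairs: top-left, top-right, bottom-left, bottom-right
    ct = [(bt[0], bt[1]), (bt[2], bt[1]), (bt[0], bt[3]), (bt[2], bt[3])]
    cp = [(bp[0], bp[1]), (bp[2], bp[1]), (bp[0], bp[3]), (bp[2], bp[3])]
    corner_sq = sum((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                    for a, b in zip(ct, cp))
    # squared diagonal of the smallest enclosing rectangle
    ex1, ey1 = min(bt[0], bp[0]), min(bt[1], bp[1])
    ex2, ey2 = max(bt[2], bp[2]), max(bt[3], bp[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return 1.0 - iou + corner_sq / (4.0 * c2)
```

Unlike CIoU, which penalizes only the center-point distance plus an aspect-ratio term, matching all four corners constrains both the position and the shape of the predicted box in a single term, which suits large-aspect-ratio ship targets.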
CN202210860051.6A 2022-07-21 2022-07-21 High-resolution remote sensing ship detection method based on improved YOLOv4 Pending CN115205264A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210860051.6A CN115205264A (en) 2022-07-21 2022-07-21 High-resolution remote sensing ship detection method based on improved YOLOv4

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210860051.6A CN115205264A (en) 2022-07-21 2022-07-21 High-resolution remote sensing ship detection method based on improved YOLOv4

Publications (1)

Publication Number Publication Date
CN115205264A true CN115205264A (en) 2022-10-18

Family

ID=83584838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210860051.6A Pending CN115205264A (en) 2022-07-21 2022-07-21 High-resolution remote sensing ship detection method based on improved YOLOv4

Country Status (1)

Country Link
CN (1) CN115205264A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115909225A (en) * 2022-10-21 2023-04-04 武汉科技大学 OL-YoloV5 ship detection method based on online learning
CN115937703B (en) * 2022-11-30 2024-05-03 南京林业大学 Enhanced feature extraction method for remote sensing image target detection
CN115937703A (en) * 2022-11-30 2023-04-07 南京林业大学 Enhanced feature extraction method for remote sensing image target detection
CN115966009A (en) * 2023-01-03 2023-04-14 迪泰(浙江)通信技术有限公司 Intelligent ship detection system and method
CN116012601A (en) * 2023-01-16 2023-04-25 苏州大学 Yolo_sr system, target detection method and device for sweeping robot
CN116229191A (en) * 2023-03-13 2023-06-06 东莞理工学院 Target detection method based on normalized corner distance and target foreground information
CN116229191B (en) * 2023-03-13 2023-08-29 东莞理工学院 Target detection method based on normalized corner distance and target foreground information
CN117611877B (en) * 2023-10-30 2024-05-14 西安电子科技大学 LS-YOLO network-based remote sensing image landslide detection method
CN117611877A (en) * 2023-10-30 2024-02-27 西安电子科技大学 LS-YOLO network-based remote sensing image landslide detection method
CN117496666A (en) * 2023-11-16 2024-02-02 成都理工大学 Intelligent and efficient drowning rescue system and method
CN117541584B (en) * 2024-01-09 2024-04-02 中国飞机强度研究所 Mask rotation superposition full-machine test crack characteristic enhancement and identification method
CN117541584A (en) * 2024-01-09 2024-02-09 中国飞机强度研究所 Mask rotation superposition full-machine test crack characteristic enhancement and identification method
CN117854111A (en) * 2024-01-15 2024-04-09 江南大学 Improved YOLOv4 plasmodium detection method based on enhanced feature fusion

Similar Documents

Publication Publication Date Title
CN115205264A (en) High-resolution remote sensing ship detection method based on improved YOLOv4
CN112818903B (en) Small sample remote sensing image target detection method based on meta-learning and cooperative attention
CN108961235B (en) Defective insulator identification method based on YOLOv3 network and particle filter algorithm
CN110188705B (en) Remote traffic sign detection and identification method suitable for vehicle-mounted system
CN111695448B (en) Roadside vehicle identification method based on visual sensor
CN110796009A (en) Method and system for detecting marine vessel based on multi-scale convolution neural network model
CN112070070B (en) LW-CNN method and system for urban remote sensing scene recognition
CN115346177A (en) Novel system and method for detecting target under road side view angle
CN114332921A (en) Pedestrian detection method based on improved clustering algorithm for Faster R-CNN network
CN110503049B (en) Satellite video vehicle number estimation method based on generation countermeasure network
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
Wang et al. Based on the improved YOLOV3 small target detection algorithm
CN112785610B (en) Lane line semantic segmentation method integrating low-level features
CN114882490B (en) Unlimited scene license plate detection and classification method based on point-guided positioning
CN116129327A (en) Infrared vehicle detection method based on improved YOLOv7 algorithm
CN113673534B (en) RGB-D image fruit detection method based on FASTER RCNN
CN110674687A (en) Robust and efficient unmanned pedestrian detection method
CN117274723B (en) Target identification method, system, medium and equipment for power transmission inspection
Wang et al. GSC-YOLOv5: An Algorithm based on Improved Attention Mechanism for Road Creak Detection
CN116777895B (en) Concrete bridge Liang Biaoguan disease intelligent detection method based on interpretable deep learning
CN116229381B (en) River and lake sand production ship face recognition method
CN117079142B (en) Anti-attention generation countermeasure road center line extraction method for automatic inspection of unmanned aerial vehicle
Chen et al. UAV remote sensing image Rural homestead detection based on deep learning
CN114118125A (en) Multi-modal input and space division three-dimensional target detection method
CN116229409A (en) Crosswalk line detection method introducing global features and multi-scale features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination