CN114220015A - Improved YOLOv5-based satellite image small target detection method - Google Patents
- Publication number
- CN114220015A (application CN202111567696.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- layer
- detection
- feature
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a satellite image small target detection method based on an improved YOLOv5. The method has a degree of generality for small target detection; this patent takes small target detection in remote sensing images as its illustrative case. To address the false detections, missed detections, and insufficient feature extraction capability for small targets in remote sensing image target detection, a small target detection algorithm based on an improved YOLOv5 is proposed. The algorithm enhances the data with a Mosaic-6 method, replaces the backbone network with a Swin Transformer structure that has stronger feature extraction capability, and adjusts the loss function, so that the network can capture global information and rich contextual information. The network neck is modified by introducing the CBAM attention module into the feature pyramid and the path aggregation network, which helps the network adaptively refine intermediate feature maps and further improves the model's detection of small targets. The improved algorithm is applied to remote sensing images with dense small targets; experimental results show that, compared with the original YOLOv5 algorithm, it has stronger feature extraction capability and higher detection accuracy for small targets.
Description
Technical Field
The invention relates to the field of target detection in deep learning, and in particular to remote sensing image target detection technology aimed at small target detection.
Background
In the field of remote sensing, satellite images generally have very high resolution and contain a large number of small target objects. Because these targets are small in size and occupy few pixels relative to the whole image, it is difficult to detect them accurately when targets must be detected and identified quickly. As deep learning technology matures, more and more target detection methods are being applied to remote sensing images. Because many target objects coexist in an image, detecting and locating small targets is very challenging, and in practice large numbers of false detections and missed detections occur, degrading the overall detection performance. Research on small target detection in remote sensing images is therefore one of the hot spots in the development of artificial intelligence.
To accurately detect weak, small targets in remote sensing images, common detection methods include: the Haar classifier; the histogram of oriented gradients with a support vector machine (HOG + SVM); the deformable part model (DPM); and deep neural network based methods. The Haar classifier cascades strong classifiers trained by the AdaBoost algorithm and uses efficient rectangular features with the integral image for low-level feature extraction, but because the raw features contain little contextual information, it cannot extract higher-level features or reliably identify the targets to be detected. The histogram of oriented gradients (HOG) is a dense descriptor over local overlapping regions of the image; it builds features by computing histograms of gradient directions over local regions and is combined with an SVM classifier for detection, but descriptor generation is slow, dense targets are hard to handle, and the method is quite sensitive to noisy data. The deformable part model (DPM) can be regarded as an upgraded combination of gradient histograms and an SVM classifier, but its structure is relatively complex, detection is relatively slow, and it does not perform well on targets in complex scenes.
With the continuous progress and rapid development of deep learning, its application to remote sensing imagery has become increasingly widespread, particularly in target detection, where excellent frameworks such as YOLO, R-CNN and SSD have appeared; small target detection, however, has remained a difficult problem in the field. The invention aims to solve the problems caused by the large number of small targets in remote sensing images. The method has a degree of generality for small target detection, and the data enhancement module is improved for images containing small targets so that the network can learn and extract finer detail.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a small target detection technique based on an improved YOLOv5. The technique applies YOLOv5, a high-performance general-purpose target detection model in deep learning, and further improves the YOLOv5 algorithm for the problems found in remote sensing images, such as small target size and large target counts (as shown in FIG. 1).
The technical scheme adopted by the invention is as follows:
Step 1: the input image enters the network; adaptive anchor-box calculation is performed first, adaptive image scaling second, and Mosaic-6 data enhancement third;
Step 2: the feature extraction backbone adopts a Swin Transformer structure and comprises a first Focus slicing operation, a first downsampling layer, a second convolution-normalization layer, a second downsampling layer, a third convolution-normalization layer, a third downsampling layer, a fourth convolution-normalization layer, a fourth downsampling layer, a fifth convolution-normalization layer and a fifth downsampling layer;
Step 3: 1×1 convolutions are applied to the feature maps produced by the third to fifth downsampling layers of step 2, and the resulting feature maps are denoted M3, M4 and M5 respectively;
Step 4: this step is a conventional FPN structure, which adopts a top-down path for multi-scale target detection so that high-level semantic features are fused into lower layers containing rich positional information; M5 is passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion and is denoted P5; M5 is upsampled by a factor of 2, added pixel-by-pixel to M4, and passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion, generating a feature map denoted P4; M4 is upsampled by a factor of 2, added pixel-by-pixel to M3, and passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion, generating a feature map P3;
Step 5: on the basis of the FPN of step 4, a bottom-up path, called a PAN (Path Aggregation Network), is added so that low-level features are fused with high-level features containing rich semantic information; P3 is taken as the bottom-level feature A3 and downsampled by a factor of 2, then added pixel-by-pixel to P4 to obtain feature map A4; A4 is downsampled by a factor of 2 and added pixel-by-pixel to P5 to generate feature map A5; as in step 4, A3-A5 are passed through 3×3 convolutions to eliminate aliasing effects, generating the final feature maps Q3-Q5;
Step 6: this step is the core of the patent; a lightweight attention module (CBAM) is inserted after each upsampling of step 4, inferring attention maps sequentially along the two independent dimensions of channel and space and multiplying them with the input feature map for adaptive feature refinement; similarly, a CBAM module is added after each downsampling of step 5 to learn a weight distribution from the features and apply it to the original features, changing their distribution so as to enhance effective features and suppress ineffective ones;
Step 7: the feature maps Q3-Q5 are fed into the YOLO detection-head network, whose anchor settings are obtained in advance by clustering the dataset; the candidate boxes output by the prediction network are then filtered by non-maximum suppression and mapped back to the original image size, the boxes select the target objects in the image, and the final detection result is obtained.
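For illustration, the top-down and bottom-up fusion of steps 4 and 5 can be sketched in PyTorch as follows. This is a reconstruction for explanation only, not the patented implementation: the channel width (256), nearest-neighbor upsampling, and max-pooling for the 2× downsampling are all assumptions.

```python
import torch
import torch.nn as nn

class FPNPANNeck(nn.Module):
    """Sketch of steps 4-5: a top-down FPN pass followed by a bottom-up
    PAN pass, each fusion smoothed by a 3x3 anti-aliasing convolution."""
    def __init__(self, ch=256):
        super().__init__()
        # six 3x3 convolutions that suppress the aliasing introduced by fusion
        self.smooth = nn.ModuleList([nn.Conv2d(ch, ch, 3, padding=1) for _ in range(6)])
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.down = nn.MaxPool2d(2)  # assumed form of the 2x down-sampling

    def forward(self, m3, m4, m5):
        # Step 4: top-down path (FPN), following the fusion order in the text
        p5 = self.smooth[0](m5)
        p4 = self.smooth[1](self.up(m5) + m4)
        p3 = self.smooth[2](self.up(m4) + m3)
        # Step 5: bottom-up path (PAN)
        a3 = p3
        a4 = self.down(a3) + p4
        a5 = self.down(a4) + p5
        return self.smooth[3](a3), self.smooth[4](a4), self.smooth[5](a5)
```

With M3/M4/M5 at strides 8/16/32 of a 640×640 input, the outputs Q3-Q5 keep the same 80×80, 40×40 and 20×20 resolutions, as the element-wise additions require.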
Compared with the prior art, the invention has the beneficial effects that:
(1) on the detection of a satellite image small target, higher identification precision can be achieved;
(2) for intensive target detection, a better detection effect can be shown.
Drawings
FIG. 1: typical small targets in a remote sensing image.
FIG. 2: flow chart of Mosaic data enhancement.
FIG. 3: detail schematic of Mosaic-6 data enhancement.
FIG. 4: schematic of the original YOLOv5 feature extraction model.
FIG. 5: receptive fields of each layer of the feature extraction network.
FIG. 6: sampling schematic after adding the Swin Transformer network.
FIG. 7: schematic of the improved feature fusion network.
FIG. 8: anchor sizes in the original YOLOv5.
FIG. 9: detection effect of the improved algorithm model on small image targets.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
First, the process by which the YOLOv5 network model's CSPNet extracts features from a remote sensing image is shown in FIG. 4. The input image has size 640×640×3. It first passes through the Focus structure, where a slicing operation reduces the spatial size of the image while increasing the number of channels, giving a feature map of 320×320×32; after the second convolution operation the feature map becomes 160×160×64; after the third, 80×80×128; after the fourth, 40×40×256; and after the fifth, 20×20×512.
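The size progression above can be reproduced with a short helper. This is a sketch of the halve-resolution/double-channels pattern only; the stage count and initial channel width of 32 are taken from the paragraph above, not from any published configuration file.

```python
def backbone_shapes(size=640, channels=3, stages=5, c0=32):
    """Trace the (H, W, C) shape through `stages` stride-2 stages, where each
    stage halves the spatial resolution and doubles the channel count."""
    shapes = [(size, size, channels)]
    c, s = c0, size
    for _ in range(stages):
        s //= 2                  # stride-2 slicing/convolution halves H and W
        shapes.append((s, s, c))
        c *= 2                   # channel count doubles at the next stage
    return shapes
```

Calling `backbone_shapes()` yields the sequence listed above, from (640, 640, 3) down to (20, 20, 512).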
In the convolutional network that generates these feature maps, the neurons producing low-level feature maps have passed through few stacked convolutions, so their receptive fields on the original image are small and they emphasize and preserve detail such as edges and texture; the neurons producing high-level feature maps have passed through many stacked convolutions, so their receptive fields on the original image are large and they emphasize and preserve semantic information. High-level features have been downsampled many times, and most fine detail has been discarded. FIG. 5 shows the receptive fields rooted in the output feature maps of each layer of the CSPNet.
YOLOv5 performs its subsequent classification and regression tasks on the feature maps output after 8×, 16× and 32× downsampling, i.e., feature maps at large, medium and small scales with strides of 8, 16 and 32. A small target in a remote sensing image, however, often occupies only a few pixels, and the semantic information the network can extract from those few pixels is very limited. In the extreme case, a small target may correspond to only a single point on the highest-level feature map, so small target detection must rely more on feature maps extracted by neurons with smaller receptive fields.
The invention then improves the YOLOv5 detection model by introducing a Swin Transformer feature extraction backbone. As shown in FIG. 6, features of the image to be detected are extracted through a deep network using the concept of Window Multi-Head Self-Attention (W-MSA): for example, at the 4× and 8× downsampling stages in the figure, the feature map is divided into several disjoint regions (windows), and multi-head self-attention is computed only within each window. This reduces the computational cost while information is still exchanged between adjacent windows, allowing the network to extract finer detail of the target.
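The window partitioning underlying W-MSA can be sketched as follows, after the publicly described Swin Transformer design. The (B, H, W, C) tensor layout and the assumption that H and W divide evenly by the window size are illustrative choices, not details taken from the patent.

```python
import torch

def window_partition(x, ws):
    """Split a feature map of shape (B, H, W, C) into non-overlapping
    ws x ws windows, so self-attention can be computed per window (W-MSA).
    Returns a tensor of shape (num_windows * B, ws, ws, C)."""
    B, H, W, C = x.shape
    # group rows and columns into window-sized blocks
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    # bring the two block indices together, then flatten them into the batch
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, ws, ws, C)
```

Attention computed on the partitioned tensor scales with the window area rather than with the full H×W map, which is the computational saving the text describes.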
Detailed description of the invention
(1) The input end preprocesses the image. Inspired by the Mosaic idea, an enhanced version of the Mosaic method, Mosaic-6, is adopted: six pictures are cropped, randomly arranged and randomly scaled, then combined into one picture. This increases the amount of sample data and reasonably introduces random noise, strengthening the network model's discrimination of small target samples in the image and improving the model's generalization ability.
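A hypothetical sketch of the Mosaic-6 idea follows: randomly crop six images and tile them into one training canvas. The 2×3 grid layout and the crop policy are assumptions; the patent states only that six pictures are cut, randomly arranged and randomly scaled, then combined into one picture (label remapping is omitted here).

```python
import numpy as np

def mosaic6(images, out_hw=(640, 640), rng=np.random.default_rng(0)):
    """Combine six images into one canvas on an assumed 2x3 grid.
    Each cell receives a random cell-sized crop of one source image,
    which acts as the random crop/scale jitter described in the text."""
    H, W = out_hw
    ch, cw = H // 2, W // 3              # cell size of the 2x3 grid
    order = rng.permutation(6)           # random arrangement of the six images
    canvas = np.zeros((H, W, 3), dtype=np.uint8)
    for k, idx in enumerate(order):
        r, c = divmod(k, 3)
        img = images[idx]
        y = rng.integers(0, max(1, img.shape[0] - ch))
        x = rng.integers(0, max(1, img.shape[1] - cw))
        canvas[r*ch:(r+1)*ch, c*cw:(c+1)*cw] = img[y:y+ch, x:x+cw]
    return canvas
```

In a real pipeline the bounding-box labels of each source image would be cropped, shifted and rescaled along with the pixels; only the image side is sketched here.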
(2) M5 is passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion and is denoted P5; M5 is upsampled by a factor of 2, added pixel-by-pixel to M4, and passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion, generating a feature map denoted P4; M4 is upsampled by a factor of 2, added pixel-by-pixel to M3, and passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion, generating a feature map P3.
(3) On this basis, a bottom-up path, called a PAN (Path Aggregation Network), is added so that low-level features are fused with high-level features containing rich semantic information; P3 is taken as the bottom-level feature A3 and downsampled by a factor of 2, then added pixel-by-pixel to P4 to obtain feature map A4; A4 is downsampled by a factor of 2 and added pixel-by-pixel to P5 to generate feature map A5; as in step 4, A3-A5 are passed through 3×3 convolutions to eliminate aliasing effects, generating the final feature maps Q3-Q5.
the improvement has two advantages, on one hand, the model fully utilizes low-level features containing abundant detail information to detect small targets; on the other hand, the deep semantic features are transmitted from top to bottom by the feature pyramid network, the position information of the target is transmitted from bottom to top by the path aggregation network, the feature is better learned by the model through the fusion of the feature information from top to bottom and from bottom to top, and the sensitivity of the model to small targets and shielding targets is enhanced.
In the original YOLOv5, proceeding directly after a downsampling step loses part of the useful feature information; similarly, operating directly on the output of an upsampling step loses some features, leaving the recovery incomplete. Each sampling operation is therefore followed by an attention mechanism module (CBAM): given a feature map, CBAM infers attention maps in turn along the two independent dimensions of channel and space, then multiplies them with the input feature map for adaptive feature refinement, preserving more features for the next convolution operation.
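A minimal CBAM sketch following the published module design (channel attention from average- and max-pooled descriptors through a shared MLP, then spatial attention from channel-wise average and max maps). The reduction ratio of 16 and the 7×7 spatial kernel are the usual defaults, not values stated in the patent.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention, each multiplied
    onto the feature map, as described in the text above."""
    def __init__(self, ch, reduction=16, kernel=7):
        super().__init__()
        # shared MLP for the channel-attention branch
        self.mlp = nn.Sequential(
            nn.Conv2d(ch, ch // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1, bias=False),
        )
        # conv over the 2-channel [avg, max] map for spatial attention
        self.spatial = nn.Conv2d(2, 1, kernel, padding=kernel // 2, bias=False)

    def forward(self, x):
        # channel attention: shared MLP over avg- and max-pooled descriptors
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # spatial attention: conv over channel-wise average and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```

The module is shape-preserving, so it can be dropped in after any sampling operation in the neck without changing downstream layer sizes.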
The anchor settings strongly affect the model's detection precision and convergence speed. The default anchor aspect ratios and sizes in YOLOv5 were validated on the COCO dataset, whereas anchors should be designed around the actual sizes of the targets to be detected. The objects here are small targets in remote sensing images, with many small targets and targets of roughly 1:1 aspect ratio, so the anchor sizes and aspect ratio parameters are set automatically according to the actual distribution of targets in the dataset.
The invention clusters anchor boxes on the remote sensing image dataset loaded into the network using the K-means clustering algorithm, automatically generating corresponding anchor sizes, and, combined with the multi-scale detection scheme described above, assigns anchors of different sizes to feature maps of different sizes. This amounts to adding good prior information and reduces the difficulty of bounding box regression to a certain extent.
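The anchor clustering can be sketched as plain k-means over (width, height) pairs. Note this is a simplification: YOLO implementations typically cluster with an IoU-based distance, while Euclidean distance is used here for brevity; the sort-by-area step reflects the idea of assigning small anchors to the high-resolution feature map.

```python
import numpy as np

def kmeans_anchors(wh, k=9, iters=50, seed=0):
    """Cluster an (N, 2) array of box (width, height) pairs into k anchors
    with plain k-means, then sort the anchors by area (small to large)."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign every box to its nearest center (Euclidean distance)
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned boxes
        for j in range(k):
            if np.any(labels == j):
                centers[j] = wh[labels == j].mean(axis=0)
    # small anchors first, so they pair with the high-resolution head
    return centers[np.argsort(centers.prod(axis=1))]
```

In use, `wh` would be gathered from the ground-truth boxes of the remote sensing dataset, and the resulting anchors split into groups of three per detection scale.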
FIG. 9 shows the detection results of the improved YOLOv5 algorithm on the targets to be detected; the algorithm model accurately detects the small targets in the image, and problems such as false detections and missed detections are largely resolved.
Building on the original YOLOv5 algorithm, the method improves and optimizes three aspects, namely Mosaic data enhancement, the feature extraction backbone network, and the attention mechanism, effectively increasing the detection precision of the YOLOv5 network model for small target objects.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may, unless expressly stated otherwise, be replaced by an alternative feature serving the same, equivalent or similar purpose; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except combinations in which features or steps are mutually exclusive.
Claims (4)
1. A satellite image small target detection method based on an improved YOLOv5, characterized by comprising the following steps:
step 1: the input image enters the network; adaptive image scaling is performed first, followed by Mosaic-6 data enhancement;
step 2: the feature extraction backbone adopts a Swin Transformer structure and comprises a first Focus slicing operation, a first downsampling layer, a second convolution-normalization layer, a second downsampling layer, a third convolution-normalization layer, a third downsampling layer, a fourth convolution-normalization layer, a fourth downsampling layer, a fifth convolution-normalization layer and a fifth downsampling layer;
step 3: 1×1 convolutions are applied to the feature maps produced by the third to fifth downsampling layers of step 2, and the resulting feature maps are denoted M3, M4 and M5 respectively;
step 4: this step is a conventional FPN structure, which adopts a top-down path for multi-scale target detection so that high-level semantic features are fused into lower layers containing rich positional information; M5 is passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion and is denoted P5; M5 is upsampled by a factor of 2, added pixel-by-pixel to M4, and passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion, generating a feature map denoted P4; M4 is upsampled by a factor of 2, added pixel-by-pixel to M3, and passed through a 3×3 convolution to eliminate the aliasing effect brought by fusion, generating a feature map P3;
step 5: on the basis of the FPN of step 4, a bottom-up path, called a PAN (Path Aggregation Network), is added so that low-level features are fused with high-level features containing rich semantic information; P3 is taken as the bottom-level feature A3 and downsampled by a factor of 2, then added pixel-by-pixel to P4 to obtain feature map A4; A4 is downsampled by a factor of 2 and added pixel-by-pixel to P5 to generate feature map A5; as in step 4, A3-A5 are passed through 3×3 convolutions to eliminate aliasing effects, generating the final feature maps Q3-Q5;
step 6: this step is the core of the patent; a lightweight attention module (CBAM) is inserted after each upsampling of step 4, inferring attention maps sequentially along the two independent dimensions of channel and space and multiplying them with the input feature map for adaptive feature refinement; similarly, a CBAM module is added after each downsampling of step 5 to learn a weight distribution from the features and apply it to the original features, changing their distribution so as to enhance effective features and suppress ineffective ones;
step 7: the feature maps Q3-Q5 are fed into the YOLO detection-head network, whose anchor settings are obtained in advance by clustering the dataset; the candidate boxes output by the prediction network are then filtered by non-maximum suppression and mapped back to the original image size, the boxes select the target objects in the image, and the final detection result is obtained.
2. The method of claim 1, wherein the Mosaic-6 data enhancement module in step 1 combines 6 pictures into one picture after random cropping, random arrangement and random scaling.
3. The method of claim 1, wherein the Swin Transformer backbone network in step 2 has stronger feature extraction capability.
4. The method of claim 1, wherein after the lightweight attention mechanism module (CBAM) of step 6 is introduced after the convolutional layers, the features can cover more parts of the object to be identified, so that the probability of identifying the object becomes higher, which helps the network focus on key information and find the region of interest.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111567696.2A CN114220015A (en) | 2021-12-21 | 2021-12-21 | Improved YOLOv5-based satellite image small target detection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111567696.2A CN114220015A (en) | 2021-12-21 | 2021-12-21 | Improved YOLOv5-based satellite image small target detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114220015A true CN114220015A (en) | 2022-03-22 |
Family
ID=80704553
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111567696.2A Pending CN114220015A (en) | 2021-12-21 | 2021-12-21 | Improved YOLOv5-based satellite image small target detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114220015A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179217A (en) * | 2019-12-04 | 2020-05-19 | 天津大学 | Attention mechanism-based remote sensing image multi-scale target detection method |
CN111914917A (en) * | 2020-07-22 | 2020-11-10 | 西安建筑科技大学 | Target detection improved algorithm based on feature pyramid network and attention mechanism |
CN112465752A (en) * | 2020-11-16 | 2021-03-09 | 电子科技大学 | Improved Faster R-CNN-based small target detection method |
WO2021139069A1 (en) * | 2020-01-09 | 2021-07-15 | 南京信息工程大学 | General target detection method for adaptive attention guidance mechanism |
CN113361428A (en) * | 2021-06-11 | 2021-09-07 | 浙江澄视科技有限公司 | Image-based traffic sign detection method |
WO2021244079A1 (en) * | 2020-06-02 | 2021-12-09 | 苏州科技大学 | Method for detecting image target in smart home environment |
Non-Patent Citations (2)
Title |
---|
Zhou Xing; Chen Lifu: "Remote sensing image target detection based on a dual attention mechanism", Computer and Modernization (计算机与现代化), no. 08, 15 August 2020 (2020-08-15) *
Ma Senquan; Zhou Ke: "Improved small target detection algorithm based on attention mechanism and feature fusion", Computer Applications and Software (计算机应用与软件), no. 05, 12 May 2020 (2020-05-12) *
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114677362A (en) * | 2022-04-08 | 2022-06-28 | Sichuan University | Surface defect detection method based on improved YOLOv5 |
CN114677362B (en) * | 2022-04-08 | 2023-09-12 | Sichuan University | Surface defect detection method based on improved YOLOv5 |
CN114913428A (en) * | 2022-04-26 | 2022-08-16 | Harbin University of Science and Technology | Remote sensing image target detection system based on deep learning |
CN115273017A (en) * | 2022-04-29 | 2022-11-01 | Guilin University of Electronic Technology | Traffic sign detection recognition model training method and system based on Yolov5 |
CN114998759A (en) * | 2022-05-27 | 2022-09-02 | University of Electronic Science and Technology of China | High-precision SAR ship detection method based on Vision Transformer |
CN114677504B (en) * | 2022-05-30 | 2022-11-15 | Shenzhen Aishen Yingtong Information Technology Co., Ltd. | Target detection method, device, equipment terminal and readable storage medium |
CN114677504A (en) * | 2022-05-30 | 2022-06-28 | Shenzhen Aishen Yingtong Information Technology Co., Ltd. | Target detection method, device, equipment terminal and readable storage medium |
CN115272987B (en) * | 2022-07-07 | 2023-08-22 | Huaiyin Institute of Technology | MSA-Yolov5-based vehicle detection method and device in severe weather |
CN115272987A (en) * | 2022-07-07 | 2022-11-01 | Huaiyin Institute of Technology | MSA-Yolov5-based vehicle detection method and device in severe weather |
CN115294483A (en) * | 2022-09-28 | 2022-11-04 | Shandong University | Small target identification method and system for complex scene of power transmission line |
CN116152591B (en) * | 2022-11-25 | 2023-11-07 | Sun Yat-sen University | Model training method, infrared small target detection method and device and electronic equipment |
CN116152591A (en) * | 2022-11-25 | 2023-05-23 | Sun Yat-sen University | Model training method, infrared small target detection method and device and electronic equipment |
CN116109966B (en) * | 2022-12-19 | 2023-06-27 | Aerospace Information Research Institute, Chinese Academy of Sciences | Remote sensing scene-oriented video large model construction method |
CN116109966A (en) * | 2022-12-19 | 2023-05-12 | Aerospace Information Research Institute, Chinese Academy of Sciences | Remote sensing scene-oriented video large model construction method |
CN116385903A (en) * | 2023-05-29 | 2023-07-04 | Harbin Institute of Technology (Shenzhen) (Harbin Institute of Technology Shenzhen Science and Technology Innovation Research Institute) | Anti-distortion on-orbit target detection method and model for level-1 remote sensing data |
CN116385903B (en) * | 2023-05-29 | 2023-09-19 | Harbin Institute of Technology (Shenzhen) (Harbin Institute of Technology Shenzhen Science and Technology Innovation Research Institute) | Anti-distortion on-orbit target detection method and model for level-1 remote sensing data |
CN117274957A (en) * | 2023-11-23 | 2023-12-22 | Southwest Jiaotong University | Road traffic sign detection method and system based on deep learning |
CN117274957B (en) * | 2023-11-23 | 2024-03-01 | Southwest Jiaotong University | Road traffic sign detection method and system based on deep learning |
CN117671509A (en) * | 2024-02-02 | 2024-03-08 | Wuhan Zhuomu Technology Co., Ltd. | Remote sensing target detection method and device, electronic equipment and storage medium |
CN117671509B (en) * | 2024-02-02 | 2024-05-24 | Wuhan Zhuomu Technology Co., Ltd. | Remote sensing target detection method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114220015A (en) | Improved YOLOv5-based satellite image small target detection method | |
CN107844779B (en) | Video key frame extraction method | |
CN109684925B (en) | Depth image-based human face living body detection method and device | |
Zhou et al. | Robust vehicle detection in aerial images using bag-of-words and orientation aware scanning | |
CN108062525B (en) | Deep learning hand detection method based on hand region prediction | |
CN110929593B (en) | Real-time saliency pedestrian detection method based on detail discrimination | |
CN110263712B (en) | Coarse and fine pedestrian detection method based on region candidates | |
TW201926140A (en) | Method, electronic device and non-transitory computer readable storage medium for image annotation | |
Lee et al. | SNIDER: Single noisy image denoising and rectification for improving license plate recognition | |
CN111353544B (en) | Improved Mixed Pooling-YOLOv3-based target detection method | |
EP2864933A1 (en) | Method, apparatus and computer program product for human-face features extraction | |
CN113591968A (en) | Infrared weak and small target detection method based on asymmetric attention feature fusion | |
Danelljan et al. | Deep motion and appearance cues for visual tracking | |
CN109035300B (en) | Target tracking method based on depth feature and average peak correlation energy | |
CN112950477A (en) | High-resolution saliency target detection method based on dual-path processing | |
Lu et al. | Learning attention map from images | |
CN112785480B (en) | Image splicing tampering detection method based on frequency domain transformation and residual error feedback module | |
Xu et al. | Dktnet: dual-key transformer network for small object detection | |
CN112580480A (en) | Hyperspectral remote sensing image classification method and device | |
Xu et al. | LMO-YOLO: A ship detection model for low-resolution optical satellite imagery | |
Fan et al. | A novel sonar target detection and classification algorithm | |
Luo et al. | Weakly supervised learning for raindrop removal on a single image | |
Xu et al. | COCO-Net: A dual-supervised network with unified ROI-loss for low-resolution ship detection from optical satellite image sequences | |
Li et al. | SKRWM based descriptor for pedestrian detection in thermal images | |
CN116168328A (en) | Thyroid nodule ultrasonic inspection system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||