CN112801158A - Deep learning small target detection method and device based on cascade fusion and attention mechanism
- Publication number
- CN112801158A (application CN202110081771.8A)
- Authority
- CN
- China
- Prior art keywords
- fusion
- feature
- cascade
- attention mechanism
- target
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a deep learning small target detection method and device based on cascade fusion and an attention mechanism. The method comprises the following steps: S1, inputting an image to be detected and preprocessing it; S2, extracting features from the preprocessed image using a deep convolutional neural network based on cascade fusion and an attention mechanism to obtain target image features, wherein a feature cascade fusion layer performs feature fusion based on a cascade feature fusion structure, and a spatial attention mechanism layer acquires a semantic mask of the small target region, fuses the mask with the original features channel by channel, and outputs the extracted target image features; and S3, performing prediction and post-processing on the extracted target image features to obtain and output the final target detection result. The invention realizes small target detection based on deep learning and has the advantages of simple implementation, low cost, high detection efficiency and precision, and flexible operation.
Description
Technical Field
The invention relates to the technical field of small target detection in a small unmanned platform, in particular to a deep learning small target detection method and device based on cascade fusion and an attention mechanism.
Background
In standard test datasets, large and medium targets typically make up a large proportion, and some current detection algorithms already achieve good results on them. However, practical application scenarios are often more complex: the number of small targets increases dramatically when the imaging device is far from the target or the target itself is small, so research on small target detection has great application potential in many practical problems. For example, in autonomous driving, accurately finding distant vehicles, signal lamps and signs in advance extends the perception range of the visual system so that path planning can be done earlier; in medical image analysis, it helps doctors find lesions that are slight or hard to perceive and brings forward the time of diagnosis; in video monitoring systems, it improves the monitoring accuracy of traffic flow and pedestrian flow and supports personnel and vehicle management; in high-altitude remote sensing and satellite image analysis, the acquisition equipment is far from the target and the target image is often small, so the accuracy of small target detection is particularly important.
Small unmanned platforms such as unmanned aerial vehicles, unmanned ground vehicles and unmanned ships usually carry visible light sensors. They can perform intelligent image reconnaissance and scene perception over a target area in fields such as agricultural plant protection and forest fire prevention, and offer flexible use, good cost-effectiveness and good environmental adaptability. However, because of the long distance between the target and the platform and interference from complex backgrounds, mist and cloud, target detection on a small unmanned platform faces the following problems:
(1) The target size is small. The image sensor can acquire an image of 1920 × 1080 pixels, but because of the long imaging distance or the small size of the target itself, the width and height of the target in the image are usually only tens of pixels or even fewer. When detection is performed with a method such as deep learning, the original image is usually compressed further to reduce computation and maintain detection speed, which causes more information loss. Detecting small targets effectively therefore remains a major challenge.
(2) The texture information is weak. Unlike targets shot at close range, targets at long range retain little texture information in the visible light image, and the number of effective feature points is extremely small. In addition, interference from environmental factors such as surface fog can degrade imaging quality. Because a detection task based on visible light images must both locate and identify the target, weak texture also makes accurate identification of the target difficult.
With the development of deep learning methods, the application potential of small target detection will become more and more prominent, and it has high research value in applications such as autonomous driving, medical image analysis, video monitoring and remote sensing image analysis. A traditional deep learning target detection scheme usually includes four steps: image preprocessing, feature extraction, classification and regression, and post-processing. Feature extraction is performed by a deep convolutional neural network that uses hierarchical features from shallow to deep. Feature maps at different levels of the network have different receptive fields and express information at different degrees of abstraction. Large and medium targets can easily obtain enough information from a single higher-level feature map, but small targets can hardly be detected well from the information of a single-layer feature map. The shallow-to-deep hierarchical features of a traditional deep convolutional neural network are therefore poorly suited to small target detection.
In summary, existing methods usually target large or medium-sized objects, and no method detects objects smaller than the conventional size both accurately and quickly. It is therefore desirable to provide a deep learning small target detection method that achieves efficient small target detection.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the technical problems in the prior art, the invention provides a deep learning small target detection method and device based on cascade fusion and an attention mechanism, which have the advantages of simple implementation, low cost, high detection efficiency and precision, and flexible operation.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
A deep learning small target detection method based on cascade fusion and an attention mechanism comprises the following steps:
S1, image preprocessing: inputting an image to be detected, and preprocessing the input image to obtain a preprocessed image;
S2, target feature extraction: performing feature extraction on the preprocessed image using a deep convolutional neural network based on cascade fusion and an attention mechanism, wherein the deep convolutional neural network is provided, in sequence, with a multi-size target detection layer, a feature cascade fusion layer and a spatial attention mechanism layer; the feature cascade fusion layer performs feature fusion based on a cascade feature fusion structure; and the spatial attention mechanism layer acquires a semantic mask that responds to small target regions of a specified size, fuses it channel by channel with the original features, and outputs the extracted target image features;
S3, target detection and post-processing: performing prediction and post-processing on the extracted target image features to obtain and output the final target detection result.
Further, in step S1, the size of the input image to be detected is compressed according to the structure of the deep convolutional neural network, and the compressed data is subjected to preprocessing operations of normalization and defogging.
Further, in step S2, an adjacent-scale cascade feature fusion structure is specifically used for feature fusion. The structure upsamples the feature corresponding to the adjacent-scale target layer and then fuses it with the feature of the layer below the target layer, so as to extract the target image features.
Furthermore, a feature map of a preset size is also provided in the adjacent-scale cascade feature fusion structure, to capture the detail and semantic information needed for small target detection.
Furthermore, the adjacent-scale cascade feature fusion structure uses grouped convolutions of the same size, so as to retain the differential information between feature maps from different sources.
Further, the spatial attention mechanism layer specifically learns a semantic mask for responding to a small target area with a specified size, and fuses the semantic mask and the output features in a channel-by-channel element-by-element multiplication mode.
Further, the spatial attention mechanism layer comprises a convolution layer, a batch normalization layer and an activation function layer which are sequentially connected.
Further, step S3 includes: selecting target regions of different sizes at each pixel of the extracted multi-layer feature maps, regressing the target position and category confidence of each target region using a convolution of a specified size, and outputting the final result after post-processing.
A deep learning small target detection device based on cascade fusion and an attention mechanism comprises:
an image preprocessing module, used for inputting an image to be detected and preprocessing the input image to obtain a preprocessed image;
a target feature extraction module, used for performing feature extraction on the preprocessed image using a deep convolutional neural network based on cascade fusion and an attention mechanism, wherein the deep convolutional neural network is provided, in sequence, with a multi-size target detection layer, a feature cascade fusion layer and a spatial attention mechanism layer; the feature cascade fusion layer performs feature fusion based on a cascade feature fusion structure; and the spatial attention mechanism layer acquires a semantic mask that responds to small target regions of a specified size, fuses it channel by channel with the original features, and outputs the extracted target image features;
and a target detection and post-processing module, used for performing prediction and post-processing on the extracted target image features to obtain and output the final target detection result.
A deep learning small target detection device based on cascade fusion and an attention mechanism comprises a processor and a memory, wherein the memory is used for storing a computer program and the processor is used for executing the computer program so as to perform the method described above.
Compared with the prior art, the invention has the advantages that:
1. The invention addresses the characteristics of small-size targets and of feature extraction by convolutional neural networks, and detects small-size targets by combining cascade fusion with an attention mechanism. This effectively improves the detection rate of deep learning target detection on small-size targets and its adaptability under complex backgrounds and interference.
2. The invention further uses a high-resolution feature map for detection and enhances the semantic information of the low-level feature maps through adjacent-scale cascade fusion. Combining the high-resolution feature map with adjacent-scale cascade fusion improves the detail information and semantic quality of the feature map used to detect small targets and enhances the information of the target region, thereby further improving small target detection capability.
3. By using a spatial attention network structure, the invention further enables the network model to learn autonomously, under supervision, how to highlight target region information and suppress the background region response, thereby further improving feature extraction quality and small target detection performance.
Drawings
Fig. 1 is a schematic flow chart of an implementation process of the deep learning small target detection method based on cascade fusion and attention mechanism in the embodiment.
Fig. 2 is a schematic diagram of a network structure based on cascade fusion and attention mechanism in a specific application example of the invention.
Fig. 3 is a schematic diagram of a feature fusion module structure designed in a specific application example of the present invention.
Fig. 4 is a schematic structural diagram of a single-branch and double-branch attention module designed in a specific application example of the invention.
Detailed Description
The invention is further described below with reference to the drawings and specific preferred embodiments of the description, without thereby limiting the scope of protection of the invention.
As shown in fig. 1 to 4, the detailed steps of the deep learning small target detection method based on cascade fusion and attention mechanism in the present embodiment include:
Step S1: image preprocessing: an image to be detected is input and preprocessed to obtain a preprocessed image.
Unlike the traditional target detection algorithm, the present embodiment first compresses the input image size to adapt to the input of the convolutional neural network structure, and performs normalization and defogging preprocessing operations to improve the adaptability of the model to the brightness variation and the foggy day scene. The defogging algorithm adopted in the embodiment is specifically a dark channel defogging algorithm.
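The preprocessing described above can be sketched as follows. This is a minimal illustration in plain Python, not the embodiment's implementation: the function names, the patch radius, the haze-retention factor `omega`, the transmission floor `t_min` and the simplified choice of atmospheric light are all assumptions made for the example, and the resize step is omitted.

```python
def to_unit_range(img_u8):
    """Normalization sketch: scale 8-bit RGB values (0..255) to floats in [0, 1]."""
    return [[tuple(c / 255.0 for c in px) for px in row] for row in img_u8]


def dark_channel(img, radius=1):
    """Minimum over the RGB channels and over a (2*radius+1)^2 neighbourhood."""
    h, w = len(img), len(img[0])
    min_rgb = [[min(px) for px in row] for row in img]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            dark[y][x] = min(min_rgb[j][i] for j in ys for i in xs)
    return dark


def defog(img, omega=0.95, t_min=0.1, radius=1):
    """Dark channel defogging sketch. img is an H x W grid of (r, g, b) in [0, 1]."""
    h, w = len(img), len(img[0])
    dark = dark_channel(img, radius)
    # Assumption: take the colour at the brightest dark-channel pixel as atmospheric light.
    ay, ax = max(((y, x) for y in range(h) for x in range(w)),
                 key=lambda p: dark[p[0]][p[1]])
    A = [max(c, 1e-6) for c in img[ay][ax]]
    # Transmission estimate: t = 1 - omega * dark_channel(I / A), floored at t_min.
    norm = [[tuple(px[c] / A[c] for c in range(3)) for px in row] for row in img]
    t = [[max(t_min, 1.0 - omega * d) for d in row]
         for row in dark_channel(norm, radius)]
    # Scene recovery: J = (I - A) / t + A, clamped to [0, 1].
    return [[tuple(min(1.0, max(0.0, (img[y][x][c] - A[c]) / t[y][x] + A[c]))
                   for c in range(3))
             for x in range(w)]
            for y in range(h)]


hazy = [[(0.2, 0.3, 0.4), (0.9, 0.9, 0.9)]]
clear = defog(hazy, radius=0)
```

In this toy input the bright pixel is treated as pure haze and stays unchanged, while the darker pixel has the estimated haze contribution removed and becomes darker.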
Step S2: target feature extraction based on cascade fusion and an attention mechanism: features are extracted from the preprocessed image using a deep convolutional neural network based on cascade fusion and an attention mechanism, wherein a multi-size target detection layer, a feature cascade fusion layer and a spatial attention mechanism layer are arranged in sequence in the deep convolutional neural network; the feature cascade fusion layer performs feature fusion based on a cascade feature fusion structure; and the spatial attention mechanism layer acquires a semantic mask that responds to small target regions of a specified size, fuses it channel by channel with the original features, and outputs the extracted target image features.
For small-size targets, the high-level feature map contains rich semantic information and has a large receptive field, but it cannot preserve the detail information of small targets. After many rounds of convolution and pooling, the semantic information of the feature map becomes richer, but its resolution gradually decreases and the detailed representation of small objects is gradually erased. The low-level feature map has accurate position information, but it lacks high-level semantic representation and has a small receptive field, so it is difficult to use the context of a small target to assist its detection. Small target detection needs semantic and detail information simultaneously to a much greater degree than large and medium target detection, and the smaller the target, the more pronounced the conflict between the two. This embodiment takes the characteristics of small target detection and of neural networks into account and constructs a deep convolutional neural network based on cascade fusion and an attention mechanism. The feature cascade fusion layer fuses high-level and low-level features based on a cascade feature fusion structure, and the spatial attention mechanism layer obtains a semantic mask that responds to small target regions of a specified size and fuses it channel by channel with the original features. Combining cascade fusion with the attention mechanism effectively improves the detection capability of the deep learning network on small-size targets.
In a specific application example, see Fig. 2: structure 1 is the backbone network, which is the same as that of the multi-scale target detector SSD and may specifically adopt a modified VGG-16 network structure; structure 2 is the feature cascade fusion structure; structure 3 is the spatial attention mechanism, which consists of consecutive convolution, batch normalization and activation function layers; structure 4 is the extracted target image features. With structures 2 and 3 configured as described above, the model can improve small target detection performance compared with conventional methods.
In this embodiment, an adjacent-scale cascade feature fusion structure is specifically adopted for feature fusion. The structure upsamples the feature corresponding to the adjacent-scale target layer and then fuses it with the feature of the layer below the target layer; that is, it fuses the adjacent-scale high-level feature with the lower-level feature to obtain the target image features.
In this embodiment, the feature cascade fusion structure specifically upsamples the adjacent-scale high-level features by deconvolution, fuses them with the lower-level feature map, and further extracts features using stacked 1 × 1, 3 × 3 and 1 × 1 convolutions. In a specific application example, referring to Fig. 3, the 19 × 19 and 38 × 38 feature maps are upsampled by deconvolution and fused with the 38 × 38 and 75 × 75 feature maps, respectively; channel compression uses 1 × 1 grouped convolution.
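The upsampling sizes above can be checked with the standard output-size formula for a transposed convolution (deconvolution) with dilation 1. The kernel, stride and padding values below are assumptions chosen so that the sizes match; the source does not state the deconvolution hyperparameters.

```python
def deconv_output_size(size, kernel, stride, padding=0, output_padding=0):
    """Spatial output size of a transposed convolution (dilation = 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Hypothetical settings that reproduce the feature map sizes in the text.
up_19 = deconv_output_size(19, kernel=2, stride=2)             # 19 x 19 -> 38 x 38
up_38 = deconv_output_size(38, kernel=3, stride=2, padding=1)  # 38 x 38 -> 75 x 75
```

Because 75 is odd, a plain factor-of-two upsampling of the 38 × 38 map would give 76 × 76; some choice such as the 3 × 3 kernel with padding 1 assumed here is needed for the upsampled map to align with the 75 × 75 map before fusion.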
In the adjacent-scale cascade feature fusion structure of this embodiment, a feature map of a preset size is further provided, to capture the detail and semantic information needed for small target detection. In a specific application embodiment, a higher-resolution 75 × 75 feature map can be used for detection, and the semantic information of the low-level feature maps is enhanced through adjacent-scale cascade fusion. Using the higher-resolution feature map together with adjacent-scale cascade fusion effectively improves the detail information and semantic quality of the high-resolution feature map used to detect small targets.
In this embodiment, the adjacent-scale cascade feature fusion structure specifically uses grouped convolutions of the same size, to retain the differential information between feature maps from different sources.
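To illustrate why grouped convolution keeps feature maps from different sources separate, here is a small plain-Python sketch of a 1 × 1 grouped convolution. The channel layout (upsampled high-level channels first, low-level channels second), the weights and the function name are assumptions for the example only.

```python
def grouped_conv1x1(x, weights):
    """1 x 1 grouped convolution without bias.

    x: list of C_in channels, each an H x W grid of floats.
    weights: one matrix per group, shaped [out_per_group][in_per_group].
    Each output channel only sees the input channels of its own group.
    """
    groups = len(weights)
    in_per_group = len(x) // groups
    h, w = len(x[0]), len(x[0][0])
    out = []
    for g, group_weights in enumerate(weights):
        chans = x[g * in_per_group:(g + 1) * in_per_group]
        for row_w in group_weights:
            out.append([[sum(k * c[i][j] for k, c in zip(row_w, chans))
                         for j in range(w)]
                        for i in range(h)])
    return out


# Channels 0-1: assumed to come from the upsampled high-level map.
# Channels 2-3: assumed to come from the lower-level map.
x = [[[2.0, 2.0]], [[4.0, 4.0]], [[5.0, 5.0]], [[3.0, 3.0]]]
weights = [[[0.5, 0.5]], [[1.0, -1.0]]]   # two groups, each compressing 2 -> 1
compressed = grouped_conv1x1(x, weights)
```

With two groups, the first output channel depends only on the high-level channels and the second only on the low-level channels, so channel compression does not mix the two sources. A grouped layer also uses `C_in * C_out / groups` weights instead of `C_in * C_out`.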
In step S2, the spatial attention mechanism layer specifically learns a semantic mask for responding to a small target region of a specified size, and fuses the semantic mask and the output feature by performing channel-by-channel element-by-element multiplication. The spatial attention mechanism specifically consists of 3 groups of continuous convolution layers, batch normalization layers and activation function layers. The spatial attention mechanism fuses the features by multiplying the spatial attention weight value and the output features channel by channel, so that the feature information of the target area can be enhanced.
To avoid severe information loss after the attention information is applied, which would leave too little information and harm the subsequent learning process, this embodiment specifically uses Leaky ReLU as the activation function and adopts a residual connection.
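The mask fusion and the two safeguards above can be sketched as follows. This is only an illustration: the mask values are made up, the negative slope of 0.1 is an assumption, and the residual form `features * mask + features` is one common way to realize a residual connection, not necessarily the one used in the embodiment.

```python
def leaky_relu(v, negative_slope=0.1):
    """Leaky ReLU: negative inputs are scaled down instead of being zeroed."""
    return v if v >= 0 else negative_slope * v


def apply_spatial_attention(features, mask, residual=True):
    """Multiply every channel element-wise by one H x W mask.

    features: list of C channels, each an H x W grid.
    mask: a single H x W grid shared by all channels.
    With residual=True the original features are added back, so regions with
    a mask value near zero are attenuated relative to others but not erased.
    """
    out = []
    for channel in features:
        fused = [[f * m for f, m in zip(f_row, m_row)]
                 for f_row, m_row in zip(channel, mask)]
        if residual:
            fused = [[a + f for a, f in zip(a_row, f_row)]
                     for a_row, f_row in zip(fused, channel)]
        out.append(fused)
    return out


features = [[[1.0, 2.0]], [[3.0, 4.0]]]   # 2 channels, 1 x 2 spatial grid
mask = [[0.0, 1.0]]                       # background at x=0, target at x=1
with_residual = apply_spatial_attention(features, mask)
without_residual = apply_spatial_attention(features, mask, residual=False)
```

Without the residual path the background position is zeroed in every channel; with it, the original values survive and the target position is doubled, which matches the stated aim of avoiding information loss.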
In a specific application example, refer to Fig. 4, where Fig. 4(a) corresponds to a single-branch spatial attention module and Fig. 4(b) corresponds to a dual-branch spatial attention module. With the spatial attention network structure, the network model can learn autonomously, under supervision, how to highlight target region information and suppress the background region response, so that feature extraction quality and small target detection performance are improved.
In a specific application embodiment, the deep convolutional neural network may be implemented by performing improved configuration on the basis of a conventional convolutional neural network, and the specific configuration includes:
1. Configuring the adjacent-scale cascade feature fusion structure: (1) on top of the conventional convolutional network structure, a higher-resolution 75 × 75 feature map is added for target detection, to meet the need of small target detection for detail and semantic information; (2) the adjacent-scale high-level features are upsampled by deconvolution and fused with the lower-level features; (3) all channel compression convolutions use 1 × 1 grouped convolution, to retain the differential information between feature maps from different sources.
2. Configuring the spatial attention mechanism: the spatial attention module consists of consecutive channel compression convolution, batch normalization and nonlinear activation function layers, and learns a semantic mask that responds to small target regions. The semantic mask is fused with the original features by channel-by-channel, element-by-element multiplication, which enhances the small target region response, suppresses the background response and highlights weak-texture target regions.
Step S3: target detection and post-processing: prediction and post-processing are performed on the extracted target image features to obtain and output the final target detection result.
In this embodiment, target regions with different sizes are selected at each pixel point of the extracted multilayer feature map, and the target positions and the category confidences of the target regions are regressed by using convolution with a specified size, and a final result is obtained and output after post-processing.
In a specific application embodiment, a plurality of default boxes of different sizes are set at each pixel of the extracted multi-layer feature maps, the target position and category confidence of each default box are regressed directly using a 3 × 3 convolution, and the final result is output after post-processing. The post-processing algorithm may specifically be a non-maximum suppression algorithm.
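The non-maximum suppression step mentioned above can be sketched as the usual greedy procedure. The box format `(x1, y1, x2, y2)`, the IoU threshold of 0.5 and the example boxes are assumptions for illustration; the source does not specify them.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 20, 20), (11, 11, 21, 21), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the distant third box is kept. In a multi-class detector this procedure would typically be run per class.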
The method takes into account the problems encountered when unmanned platforms and similar systems search for targets, such as small target size, weak texture, complex background, and water vapor and cloud interference. It detects small targets by combining cascade fusion with an attention mechanism, improves the detection rate of deep learning target detection on small targets and its adaptability under complex backgrounds and interference, and, compared with traditional methods, achieves small target detection with high efficiency, low cost, accurate results and flexible operation.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above-mentioned embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may be made by those skilled in the art without departing from the principle of the invention.
Claims (10)
1. A deep learning small target detection method based on cascade fusion and an attention mechanism, characterized by comprising the following steps:
S1, image preprocessing: inputting an image to be detected, and preprocessing the input image to obtain a preprocessed image;
S2, target feature extraction: performing feature extraction on the preprocessed image using a deep convolutional neural network based on cascade fusion and an attention mechanism, wherein the deep convolutional neural network is provided, in sequence, with a multi-size target detection layer, a feature cascade fusion layer and a spatial attention mechanism layer; the feature cascade fusion layer performs feature fusion based on a cascade feature fusion structure; and the spatial attention mechanism layer acquires a semantic mask that responds to small target regions of a specified size, fuses it channel by channel with the original features, and outputs the extracted target image features;
S3, target detection and post-processing: performing prediction and post-processing on the extracted target image features to obtain and output the final target detection result.
2. The cascade fusion and attention mechanism-based deep learning small target detection method according to claim 1, characterized in that: in step S1, the size of the input image to be detected is compressed according to the structure of the deep convolutional neural network, and the compressed data is then preprocessed by normalization and defogging.
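The preprocessing of claim 2 (size compression, normalization, defogging) might be sketched as follows. The claim does not name a resizing or defogging algorithm, so this sketch assumes nearest-neighbour resizing and a simplified dark channel prior without a patch minimum; the function names and the parameters `omega` and `t_min` are illustrative assumptions.

```python
import numpy as np


def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize of an HxWxC image (assumed resizing method)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols]


def normalize(img: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values to [0, 1]."""
    return img.astype(np.float32) / 255.0


def defog_dark_channel(img: np.ndarray, omega: float = 0.95, t_min: float = 0.1) -> np.ndarray:
    """Simplified dark-channel defogging on a float HxWx3 image in [0, 1].

    Assumption: the patent does not specify the defogging algorithm.
    """
    dark = img.min(axis=2)                          # per-pixel dark channel
    n = max(1, dark.size // 1000)                   # brightest 0.1% of dark-channel pixels
    idx = np.argsort(dark.ravel())[-n:]
    airlight = np.maximum(img.reshape(-1, 3)[idx].mean(axis=0), 1e-6)
    t = 1.0 - omega * (img / airlight).min(axis=2)  # transmission estimate
    t = np.clip(t, t_min, 1.0)[..., None]
    restored = (img - airlight) / t + airlight      # invert the haze model
    return np.clip(restored, 0.0, 1.0).astype(np.float32)


def preprocess(img: np.ndarray, net_h: int, net_w: int) -> np.ndarray:
    """Resize to the network input size, then normalize and defog."""
    return defog_dark_channel(normalize(resize_nearest(img, net_h, net_w)))
```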
3. The cascade fusion and attention mechanism-based deep learning small target detection method according to claim 1, characterized in that: in step S2, an adjacent-scale cascade feature fusion structure is used for feature fusion; this structure upsamples the features corresponding to the target layer of an adjacent scale and then fuses them with the features of the layer below that target layer, so as to extract the target image features.
4. The cascade fusion and attention mechanism-based deep learning small target detection method according to claim 3, characterized in that: the adjacent-scale cascade feature fusion structure is further provided with a feature map of a preset size, so that detailed semantic information can be detected during small target detection.
5. The cascade fusion and attention mechanism-based deep learning small target detection method according to claim 3, characterized in that: the adjacent-scale cascade feature fusion structure adopts grouped convolution of the same size, so as to preserve the difference information between feature maps from different sources.
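One possible reading of the adjacent-scale cascade fusion in claim 3 is a top-down chain in which each coarser feature map is upsampled and fused with the next finer one. The sketch below assumes 2x nearest-neighbour upsampling, channel concatenation as the fusion operator, and a factor-of-two size ratio between adjacent levels; none of these choices is stated in the claim.

```python
from typing import List

import numpy as np


def upsample2x(feat: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)


def cascade_fuse(features: List[np.ndarray]) -> List[np.ndarray]:
    """Fuse feature maps ordered from finest (largest H, W) to coarsest.

    Each level is concatenated with the upsampled, already-fused result of
    the adjacent coarser level, so information cascades down the pyramid.
    """
    fused = [features[-1]]                    # the coarsest level is kept as-is
    for finer in reversed(features[:-1]):
        up = upsample2x(fused[0])             # bring the coarser level to this scale
        fused.insert(0, np.concatenate([finer, up], axis=0))
    return fused
```

With three levels of 4, 8 and 16 channels, the fused maps carry 28, 24 and 16 channels respectively, so a following convolution would normally reduce the channel count again.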
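The effect described in claim 5 can be illustrated with a grouped convolution in which each group only sees the channels of one source, so the sources are not mixed at this stage. The sketch below uses a 1x1 kernel for brevity; the kernel size and the helper name `grouped_conv1x1` are assumptions, not details from the patent.

```python
from typing import List

import numpy as np


def grouped_conv1x1(x: np.ndarray, weights: List[np.ndarray]) -> np.ndarray:
    """Grouped 1x1 convolution on a (C_in, H, W) tensor.

    `weights` holds one (C_out_g, C_in_g) matrix per group. The input
    channels are split into len(weights) equal groups, and each group is
    transformed only by its own weights.
    """
    chunks = np.split(x, len(weights), axis=0)
    outs = [np.einsum("oc,chw->ohw", w, xg) for w, xg in zip(weights, chunks)]
    return np.concatenate(outs, axis=0)
```

If the input is the concatenation of two sources with equal channel counts, two groups keep the first source's outputs independent of the second source's inputs.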
6. The deep learning small target detection method based on cascade fusion and attention mechanism according to any one of claims 1 to 5, characterized in that: the spatial attention mechanism layer learns a semantic mask that responds to small target regions of a specified size, and fuses the semantic mask with the output features by channel-by-channel, element-by-element multiplication.
7. The deep learning small target detection method based on cascade fusion and attention mechanism according to any one of claims 1 to 5, characterized in that: the spatial attention mechanism layer comprises a convolution layer, a batch normalization layer and an activation function layer connected in sequence.
8. The deep learning small target detection method based on cascade fusion and attention mechanism according to any one of claims 1 to 5, characterized in that step S3 comprises: selecting target regions of different sizes at each pixel of the extracted multi-layer feature maps, regressing the target position and class confidence of each target region using a convolution of a specified size, and performing post-processing to obtain the final result for output.
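Claims 6 and 7 together describe a mask branch (convolution, batch normalization, activation) whose output is multiplied into the features. The sketch below assumes a 1x1 convolution to a single mask channel, inference-mode batch normalization with given running statistics, and a sigmoid activation; the claims do not fix the kernel size or the activation function.

```python
import numpy as np


def spatial_attention(x, conv_w, conv_b, bn_mean, bn_var,
                      bn_gamma=1.0, bn_beta=0.0, eps=1e-5):
    """Apply a spatial attention mask to a (C, H, W) feature map.

    conv_w: (C,) weights of an assumed 1x1 convolution producing one channel.
    Returns the re-weighted features and the (H, W) mask.
    """
    z = np.einsum("c,chw->hw", conv_w, x) + conv_b                   # convolution layer
    z = (z - bn_mean) / np.sqrt(bn_var + eps) * bn_gamma + bn_beta   # batch normalization
    mask = 1.0 / (1.0 + np.exp(-z))                                  # activation (sigmoid, assumed)
    # Channel-by-channel, element-by-element multiplication with the mask.
    return x * mask[None, :, :], mask
```

Because the mask has a single channel, broadcasting applies the same spatial weighting to every feature channel.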
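The region selection and post-processing of claim 8 can be sketched as default boxes placed at every feature-map pixel, followed by a suppression step. The claim only says "post-processing"; greedy non-maximum suppression is used here as an assumed example, and the box sizes, the normalized coordinate convention, and the function names are illustrative.

```python
import numpy as np


def generate_default_boxes(h, w, sizes):
    """Square boxes of each size centred on every cell of an h x w feature map.

    Returns (h * w * len(sizes), 4) boxes as (x1, y1, x2, y2) in normalized
    image coordinates, ordered by row, then column, then size.
    """
    ys, xs = np.meshgrid((np.arange(h) + 0.5) / h, (np.arange(w) + 0.5) / w, indexing="ij")
    cx = xs.reshape(-1, 1)
    cy = ys.reshape(-1, 1)
    half = np.asarray(sizes, dtype=float) / 2.0
    boxes = np.stack([cx - half, cy - half, cx + half, cy + half], axis=-1)
    return boxes.reshape(-1, 4)


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns kept indices, best score first."""
    order = np.argsort(scores)[::-1]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        iw = np.clip(np.minimum(boxes[i, 2], boxes[rest, 2]) - np.maximum(boxes[i, 0], boxes[rest, 0]), 0, None)
        ih = np.clip(np.minimum(boxes[i, 3], boxes[rest, 3]) - np.maximum(boxes[i, 1], boxes[rest, 1]), 0, None)
        inter = iw * ih
        iou = inter / (areas[i] + areas[rest] - inter + 1e-12)
        order = rest[iou <= iou_thresh]       # drop boxes overlapping the kept one
    return keep
```

In a full detector, the convolutional prediction head would supply per-box offsets and class confidences before this suppression step; that head is omitted here.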
9. A deep learning small target detection device based on cascade fusion and an attention mechanism, characterized by comprising:
an image preprocessing module, configured to input an image to be detected and preprocess it to obtain a preprocessed image;
a target feature extraction module, configured to perform feature extraction on the preprocessed image using a deep convolutional neural network based on cascade fusion and an attention mechanism, wherein the deep convolutional neural network is provided, in sequence, with a multi-size target detection layer, a feature cascade fusion layer and a spatial attention mechanism layer; the feature cascade fusion layer performs feature fusion based on a cascade feature fusion structure; and the spatial attention mechanism layer acquires a semantic mask that responds to small target regions of a specified size, fuses the mask channel by channel with the original features, and outputs the extracted target image features;
and a target detection and post-processing module, configured to perform prediction and post-processing on the extracted target image features to obtain a final target detection result, and to output the result.
10. A deep learning small target detection apparatus based on cascade fusion and attention mechanism, comprising a processor and a memory, the memory being configured to store a computer program, the processor being configured to execute the computer program, wherein the processor is configured to execute the computer program to perform the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110081771.8A CN112801158A (en) | 2021-01-21 | 2021-01-21 | Deep learning small target detection method and device based on cascade fusion and attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110081771.8A CN112801158A (en) | 2021-01-21 | 2021-01-21 | Deep learning small target detection method and device based on cascade fusion and attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112801158A (en) | 2021-05-14 |
Family
ID=75811061
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110081771.8A Pending CN112801158A (en) | 2021-01-21 | 2021-01-21 | Deep learning small target detection method and device based on cascade fusion and attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112801158A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298080A (en) * | 2021-07-26 | 2021-08-24 | 城云科技(中国)有限公司 | Target detection enhancement model, target detection method, target detection device and electronic device |
CN113436148A (en) * | 2021-06-02 | 2021-09-24 | 范加利 | Method and system for detecting critical points of ship-borne airplane wheel contour based on deep learning |
CN113486929A (en) * | 2021-06-17 | 2021-10-08 | 中国地质大学(武汉) | Rock slice image identification method based on residual shrinkage module and attention mechanism |
CN113723172A (en) * | 2021-06-11 | 2021-11-30 | 南京航空航天大学 | Fusion multi-level feature target detection method for weak and small targets of remote sensing images |
CN114494728A (en) * | 2022-02-10 | 2022-05-13 | 北京工业大学 | Small target detection method based on deep learning |
CN114529794A (en) * | 2022-04-20 | 2022-05-24 | 湖南大学 | Infrared and visible light image fusion method, system and medium |
CN114627002A (en) * | 2022-02-07 | 2022-06-14 | 华南理工大学 | Image defogging method based on self-adaptive feature fusion |
CN115100235A (en) * | 2022-08-18 | 2022-09-23 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Target tracking method, system and storage medium |
CN115451939A (en) * | 2022-08-19 | 2022-12-09 | 中国人民解放军国防科技大学 | Parallel SLAM method based on detection segmentation in dynamic scene |
CN115984672A (en) * | 2023-03-17 | 2023-04-18 | 成都纵横自动化技术股份有限公司 | Method and device for detecting small target in high-definition image based on deep learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111832509A (en) * | 2020-07-21 | 2020-10-27 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle weak and small target detection method based on space-time attention mechanism |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111832509A (en) * | 2020-07-21 | 2020-10-27 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle weak and small target detection method based on space-time attention mechanism |
Non-Patent Citations (2)
Title |
---|
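DENG JIANG et al.: "FASSD: A Feature Fusion and Spatial Attention-Based Single Shot Detector for Small Object Detection", 《ELECTRONICS 2020》 * |
MA Senquan (麻森权) et al.: "Improved small target detection algorithm based on attention mechanism and feature fusion" (基于注意力机制和特征融合改进的小目标检测算法), 《计算机应用与软件》 (Computer Applications and Software) * |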
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113436148A (en) * | 2021-06-02 | 2021-09-24 | 范加利 | Method and system for detecting critical points of ship-borne airplane wheel contour based on deep learning |
CN113723172A (en) * | 2021-06-11 | 2021-11-30 | 南京航空航天大学 | Fusion multi-level feature target detection method for weak and small targets of remote sensing images |
CN113486929A (en) * | 2021-06-17 | 2021-10-08 | 中国地质大学(武汉) | Rock slice image identification method based on residual shrinkage module and attention mechanism |
CN113486929B (en) * | 2021-06-17 | 2023-02-24 | 中国地质大学(武汉) | Rock slice image identification method based on residual shrinkage module and attention mechanism |
CN113298080A (en) * | 2021-07-26 | 2021-08-24 | 城云科技(中国)有限公司 | Target detection enhancement model, target detection method, target detection device and electronic device |
CN114627002A (en) * | 2022-02-07 | 2022-06-14 | 华南理工大学 | Image defogging method based on self-adaptive feature fusion |
CN114627002B (en) * | 2022-02-07 | 2024-09-27 | 华南理工大学 | Image defogging method based on self-adaptive feature fusion |
CN114494728A (en) * | 2022-02-10 | 2022-05-13 | 北京工业大学 | Small target detection method based on deep learning |
CN114494728B (en) * | 2022-02-10 | 2024-06-07 | 北京工业大学 | Small target detection method based on deep learning |
CN114529794A (en) * | 2022-04-20 | 2022-05-24 | 湖南大学 | Infrared and visible light image fusion method, system and medium |
CN114529794B (en) * | 2022-04-20 | 2022-07-08 | 湖南大学 | Infrared and visible light image fusion method, system and medium |
CN115100235A (en) * | 2022-08-18 | 2022-09-23 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Target tracking method, system and storage medium |
CN115451939A (en) * | 2022-08-19 | 2022-12-09 | 中国人民解放军国防科技大学 | Parallel SLAM method based on detection segmentation in dynamic scene |
CN115451939B (en) * | 2022-08-19 | 2024-05-07 | 中国人民解放军国防科技大学 | Parallel SLAM method under dynamic scene based on detection segmentation |
CN115984672A (en) * | 2023-03-17 | 2023-04-18 | 成都纵横自动化技术股份有限公司 | Method and device for detecting small target in high-definition image based on deep learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112801158A (en) | Deep learning small target detection method and device based on cascade fusion and attention mechanism | |
CN109086668B (en) | Unmanned aerial vehicle remote sensing image road information extraction method based on multi-scale generation countermeasure network | |
CN113468967B (en) | Attention mechanism-based lane line detection method, attention mechanism-based lane line detection device, attention mechanism-based lane line detection equipment and attention mechanism-based lane line detection medium | |
CN109255317B (en) | Aerial image difference detection method based on double networks | |
CN110781756A (en) | Urban road extraction method and device based on remote sensing image | |
CN107918776B (en) | Land planning method and system based on machine vision and electronic equipment | |
EP3438929B1 (en) | Foreground and background detection method | |
CN113537070B (en) | Detection method, detection device, electronic equipment and storage medium | |
CN113743163A (en) | Traffic target recognition model training method, traffic target positioning method and device | |
CN113012215A (en) | Method, system and equipment for space positioning | |
CN114648709A (en) | Method and equipment for determining image difference information | |
CN113887472A (en) | Remote sensing image cloud detection method based on cascade color and texture feature attention | |
CN111753610A (en) | Weather identification method and device | |
CN116258940A (en) | Small target detection method for multi-scale features and self-adaptive weights | |
Malav et al. | DHSGAN: An end to end dehazing network for fog and smoke | |
CN112926426A (en) | Ship identification method, system, equipment and storage medium based on monitoring video | |
CN115546569A (en) | Attention mechanism-based data classification optimization method and related equipment | |
CN117893990B (en) | Road sign detection method, device and computer equipment | |
CN116740516A (en) | Target detection method and system based on multi-scale fusion feature extraction | |
CN111881984A (en) | Target detection method and device based on deep learning | |
CN115100469A (en) | Target attribute identification method, training method and device based on segmentation algorithm | |
CN113505653B (en) | Object detection method, device, apparatus, medium and program product | |
CN112699711A (en) | Lane line detection method, lane line detection device, storage medium, and electronic apparatus | |
Wu et al. | Research on asphalt pavement disease detection based on improved YOLOv5s | |
CN112329550A (en) | Weak supervision learning-based disaster-stricken building rapid positioning evaluation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210514 |