CN111563513A - Defocus blur detection method based on attention mechanism - Google Patents
Defocus blur detection method based on attention mechanism
- Publication number
- CN111563513A (application CN202010411177.6A)
- Authority
- CN
- China
- Prior art keywords
- order
- feature map
- attention
- feature
- low
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical field of image processing and specifically relates to a defocus blur detection method based on an attention mechanism. The network structure of the invention uses a channel attention mechanism to extract the relationships between feature layers from a global perspective, effectively improving the expressive power of the features; it also applies a spatial attention mechanism that, guided by high-order semantic information, selectively extracts low-order information. The invention addresses two important problems in blur detection: correctly classifying smooth, in-focus regions, and effectively suppressing the influence of cluttered backgrounds.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a defocus blur detection method based on an attention mechanism.
Background
Defocus blur detection is a fundamental image processing task whose purpose is to segment the blurred regions of an image. Blur detection has many applications, such as image deblurring, blur magnification, and depth estimation. Among current state-of-the-art defocus blur detection methods, the convolutional neural network (CNN) is the most common and effective approach. Compared with traditional blur detection methods based on hand-crafted features, a CNN can effectively extract deep semantic information and thereby greatly improve detection results. Deep semantic information can effectively locate blurred regions, while low-order features can be used to determine the edge information of the detected regions. Existing neural-network blur detection methods fuse multi-level features by constructing larger and deeper networks, so that the network obtains better feature representations. For example, Zhao et al. propose a bottom-up fully convolutional network that fuses low-level cues and high-level semantic information for defocus blur detection. Tang et al. propose DeFusionNet, which recurrently fuses and refines multi-level feature maps and then combines these maps to obtain the final detection result.
Although current defocus blur detection methods based on deep neural networks can extract deep semantic information and thereby improve detection accuracy, existing network models do not fully exploit the feature representation capability of the convolutional neural network (CNN). They perform poorly on the two main problems of defocus blur detection: first, smooth in-focus regions may be misclassified as blurred blocks; second, because background clutter affects the detection result, these networks misclassify low-contrast in-focus regions and defocused regions with strong background noise. Some defocus blur detection methods increase the feature representation capability of the network by constructing larger and deeper network structures, but such structures cannot effectively extract the relationships between intermediate feature layers, which limits the discriminative power of the CNN. In addition, these methods superimpose all low-order feature information indiscriminately; however, different low-order information contributes differently to the detection result, and some structural features of the background can degrade detection performance or even cause some regions to be misclassified.
Disclosure of Invention
The invention aims to solve the above problems and provides a novel neural network structure which, through an attention mechanism, gives the network better discriminative ability and better suppression of background noise.
The technical scheme adopted by the invention, as shown in FIG. 1, is a defocus blur detection method based on an attention mechanism, which specifically comprises the following steps:
Step 1: sending the input picture into a pre-trained VGG-16 network and extracting multi-level feature maps;
Step 2: dividing the multi-level feature maps into two categories, one serving as high-order features and the other as low-order features;
Step 3: sending the high-order and the low-order feature maps respectively through a channel attention mechanism, so as to enhance the feature expression of the network and obtain better discrimination and learning capability;
Step 4: applying an upsampling operation (upsample) to the high-order feature map to change its size to that of the low-order feature map; then using a spatial attention mechanism which, according to the high-order semantic information, weights the detail features at different positions of the low-order feature map, giving effective detail information larger weight and suppressing the influence of background clutter;
Step 5: fusing the high-order and low-order feature maps of different levels together through cross-channel concatenation;
Step 6: further fusing the features through a convolution layer, and obtaining the final detection result after a Sigmoid function.
The beneficial effect of the method is that it addresses two important problems in blur detection: correctly classifying smooth, in-focus regions, and effectively suppressing the influence of cluttered backgrounds. For the former problem, the failure to correctly classify whether a region is defocused stems mainly from insufficient discriminative power of the network; to improve this, a channel attention mechanism is used in the network structure to extract the connections among feature layers from a global perspective, effectively improving the feature expression capability. For the latter problem, existing methods use all low-order information indiscriminately. However, different detail information affects the detection result differently, and only the detail information at the edges between sharp and blurred regions contributes the most. Low-order information from strongly cluttered backgrounds may even cause blurred regions to be incorrectly judged as sharp. Therefore, the invention applies a spatial attention mechanism and realizes selective extraction of low-order information by combining high-order semantic information.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic illustration of an attention mechanism process flow;
FIG. 3 is a schematic diagram of a spatial attention mechanism fused with low-level features;
FIG. 4 is a schematic diagram showing the comparison of the detection results of the detection method of the present invention with those of other detection methods;
FIG. 5 is a table comparing the two evaluation criteria MAE and F-measure on the two public data sets DUT and Shi.
Detailed Description
The technical scheme of the invention is further described in detail by combining the accompanying drawings:
In step 2 of the present invention: the input picture is first sent to a VGG-16 network pre-trained on ImageNet to obtain the initial high-order and low-order feature maps. Specifically, the convolutional layers of VGG-16 are divided into two categories: conv1_2 and conv2_2 serve as the shallow network to extract low-order information from the image, while conv3_3, conv4_3 and conv5_3 serve as the deep network to extract high-order semantic information. Then an upsampling (upsample) operation is applied to the high-order and low-order feature maps respectively: conv2_2 is resized to the size of conv1_2, and conv4_3 and conv5_3 are resized to match conv3_3. This yields the initial high-order and low-order features. The high-order and low-order information are then sent separately through the channel attention mechanism to extract the dependency information among the features.
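The resizing described above can be sketched minimally as follows. This is an illustrative sketch assuming nearest-neighbor interpolation and the standard VGG-16 feature sizes for a 224×224 input; the patent does not specify the interpolation mode:

```python
import numpy as np

def upsample_nearest(fmap, factor):
    """Nearest-neighbor upsampling of a (C, H, W) feature map by an integer factor."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

# Illustrative sizes: conv2_2 yields 128 channels at 112x112; conv1_2 is at 224x224.
low = np.random.rand(128, 112, 112)
low_up = upsample_nearest(low, 2)  # 128 x 224 x 224, spatially aligned with conv1_2
```

The same operation, with factor 2 or 4, would bring conv4_3 and conv5_3 to the spatial size of conv3_3.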
As shown in FIG. 2, in step 3, for an input feature map X ∈ R^(C×H×W) (where C is the number of channels, H the height and W the width of the feature map), the feature map is first reshaped into X1 ∈ R^(C×N), where N = H×W. Matrix multiplication is then performed between X1 and its transpose X1^T, and a softmax layer is applied to the result to obtain the attention feature map R:

r_ji = exp(X1_j · X1_i^T) / Σ_{k=1..C} exp(X1_j · X1_k^T)

where r_ji is the element in row j, column i of the attention feature map R, i.e., the influence factor of the ith channel on the jth channel. The more similar two feature maps are, the stronger this connection. The attention map R is then matrix-multiplied with the reshaped feature map X1 to obtain an output of size C×N, which is reshaped back to C×H×W. Finally, the output of the channel attention is multiplied by a scale coefficient α and superimposed on the original feature map through a residual connection to obtain the final output Y:

Y = α · reshape(R·X1) + X
The scale coefficient α in the formula is initialized to 0 at the beginning of training and then gradually learns a suitable value during training. As can be seen from the above formula, the final output of the module is the weighted sum of all feature maps superimposed on the original input feature map. Similar feature maps can thus reinforce each other, highlighting regions of common interest and reducing variance. By considering the interrelationships between feature layers from a global perspective, the network obtains stronger discriminative power.
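The channel attention computation of step 3 can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation; the row-wise softmax axis and the reshape-based matrix form are assumptions based on the description above:

```python
import numpy as np

def channel_attention(X, alpha):
    """Channel attention over a (C, H, W) feature map; alpha is the learned scale."""
    C, H, W = X.shape
    X1 = X.reshape(C, H * W)                 # flatten spatial dims: C x N, N = H*W
    energy = X1 @ X1.T                       # C x C channel affinity matrix
    # Row-wise softmax: r_ji = influence factor of channel i on channel j
    e = np.exp(energy - energy.max(axis=1, keepdims=True))
    R = e / e.sum(axis=1, keepdims=True)
    out = (R @ X1).reshape(C, H, W)          # weighted sum of all channel maps
    return alpha * out + X                   # residual connection onto the input

Y = channel_attention(np.random.rand(8, 4, 4), alpha=0.0)
# With alpha initialized to 0, the module starts out as the identity mapping.
```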
As shown in FIG. 3, in step 4, let the low-order feature map be F_low ∈ R^(C×H×W) and the upsampled high-order feature map be F_high. To extract global information and enlarge the receptive field without incurring too much computational cost, the invention applies two consecutive atrous convolutions (dilated convolutions) with 3×3 kernels and dilation rate 5 to the high-order feature map, and then uses a Sigmoid function to map the output values into the [0,1] interval, yielding the spatial attention map. The low-order feature map output by the module is obtained by multiplying the input low-order feature map element-wise with the spatial attention map. With this structure, the network explicitly and selectively extracts low-order detail information: low-order information that is more useful for the detection result receives larger weight, while interference from the background is effectively suppressed.
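A minimal single-channel sketch of this spatial attention gating follows. The fixed averaging kernel stands in for the learned convolution weights, and a single channel replaces the full multi-channel maps; both are simplifying assumptions made for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dilated_conv3x3(x, kernel, rate):
    """'Same'-padded 3x3 atrous convolution on a 2D map (one channel, stride 1)."""
    pad = rate
    xp = np.pad(x, pad)
    H, W = x.shape
    out = np.zeros_like(x)
    for di in (-1, 0, 1):          # the 3x3 taps are spaced `rate` pixels apart
        for dj in (-1, 0, 1):
            out += kernel[di + 1, dj + 1] * xp[pad + di * rate: pad + di * rate + H,
                                               pad + dj * rate: pad + dj * rate + W]
    return out

k = np.full((3, 3), 1.0 / 9.0)                   # hypothetical fixed kernel
high = np.random.rand(32, 32)                    # one channel of the high-order map
# Two consecutive 3x3 atrous convolutions (rate 5), then Sigmoid -> values in (0, 1)
att = sigmoid(dilated_conv3x3(dilated_conv3x3(high, k, 5), k, 5))
low = np.random.rand(32, 32)
low_weighted = low * att                         # element-wise re-weighting of details
```

The dilation rate of 5 enlarges the effective receptive field of each 3×3 kernel to 11×11 without adding parameters, which is the motivation stated above.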
In step 5, the high-order feature map is subjected to an upsampling operation (upsample) to change its size to that of the low-order feature map. For defocus blur detection, high-order features better localize the blurred blocks but lack detail information at irregular boundaries, while low-order features can be used to refine the detected boundaries but lack semantic information. Features of different levels therefore need to be fused to obtain complementary information and thereby optimize the detection result. Specifically, the high-order and low-order information are merged through cross-channel concatenation to obtain the feature maps of different levels.
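Steps 5 and 6 (cross-channel concatenation followed by a convolution and a Sigmoid) can be sketched as follows; the channel counts and the random 1×1 convolution weights are illustrative assumptions, not values from the patent:

```python
import numpy as np

high_up = np.random.rand(256, 56, 56)  # upsampled high-order features (channels assumed)
low_w = np.random.rand(64, 56, 56)     # attention-weighted low-order features
fused = np.concatenate([high_up, low_w], axis=0)    # cross-channel concatenation: 320 ch.

w = np.random.rand(320) / 320                       # hypothetical 1x1 conv weights
logits = np.tensordot(w, fused, axes=([0], [0]))    # 56 x 56 fused response map
pred = 1.0 / (1.0 + np.exp(-logits))                # Sigmoid -> per-pixel blur probability
```

Concatenation keeps both feature sets intact and lets the following convolution learn how to weight them, which is the complementarity argument made above.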
FIG. 4 compares the detection results generated by the method of the present invention with those of other state-of-the-art defocus blur detection methods. It can be seen that the method of the present invention accurately distinguishes smooth in-focus regions and effectively suppresses interference from background noise.
The table in FIG. 5 compares the two evaluation criteria MAE (smaller is better) and F-measure (larger is better) on the two public data sets DUT and Shi. It can be seen that the defocus blur detection method provided by the invention achieves the best performance on multiple measures, demonstrating its effectiveness.
Claims (1)
1. A defocus blur detection method based on an attention mechanism, characterized by comprising the following steps:
s1, inputting the picture into a pre-trained VGG-16 network, and extracting a multi-level feature map;
s2, dividing the multi-level feature graphs into two types, wherein one type is used as a high-level feature graph, and the other type is used as a low-level feature graph; the method specifically comprises the following steps: dividing convolutional layers of the VGG-16 network into two types, taking conv1_2 and conv2_2 as shallow networks to extract low-order information in the image, namely defining feature maps extracted by conv1_2 and conv2_2 as low-order feature maps; the conv3_3, conv4_3 and conv5_3 serve as deep network extraction image high-order information, namely feature maps extracted by the conv3_3, conv4_3 and conv5_3 are defined as high-order feature maps; then, using up-sampling operation on the low-order and high-order feature maps respectively, changing the feature map extracted by con2_2 into the same size as the feature map extracted by conv1_2, and changing the feature maps extracted by conv4_3 and conv5_3 into the same size as the feature map extracted by conv3_3, thereby obtaining an initial low-order feature map and a high-order feature map;
s3, respectively enabling the low-order characteristic diagram and the high-order characteristic diagram to pass through a channel attention mechanism to obtain a low-order attention characteristic diagram and a high-order attention characteristic diagram; the processing method of the channel attention mechanism comprises the following steps: for input feature mapsWherein C represents the number of channels, H represents the length of the feature map, and W represents the width of the feature map, and the feature map is firstly deformed intoThen for x1And its deviceMatrix multiplication is carried out, finally, a softmax layer is used for the multiplication result to obtain an attention characteristic graph R,
wherein r isjiRepresenting the value of the ith element in the jth row and the ith column in the attention feature map R, namely the influence factor of the ith channel on the jth channel;
transpose of input feature graph XTPerforming matrix multiplication with the characteristic diagram R to obtain the size ofAn output of (d);
the final output Y is obtained by multiplying the output of the channel attention by a proportionality coefficient alpha and superposing the product on the original characteristic diagram in a residual error connection mode:
the proportionality coefficient alpha is initialized to 0 when training is started, and then is updated through the training process;
defining a feature map obtained after the low-order feature map passes through a channel attention mechanism as a low-order attention feature map, and defining a feature map obtained after the high-order feature map passes through the channel attention mechanism as a high-order attention feature map;
s4, the size of the obtained high-order attention feature graph is changed to be the same as that of the low-order attention feature graph through an upsampling operation, then the obtained high-order attention feature graph is subjected to hollow convolution with two continuous convolution kernels of 3 x 3 and an expansion rate of 5, and an output value is mapped into a [0,1] interval by using a Sigmoid function to obtain a space attention feature graph;
s5, fusing the space attention feature map and the low-order attention feature map through cross-channel connection to the obtained space attention feature map, and obtaining a fused low-order feature map;
S6, further fusing the fused low-order feature map and the spatial attention feature map through a convolution layer, and obtaining the final detection result through a Sigmoid function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010411177.6A CN111563513B (en) | 2020-05-15 | 2020-05-15 | Defocus blur detection method based on attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010411177.6A CN111563513B (en) | 2020-05-15 | 2020-05-15 | Defocus blur detection method based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111563513A true CN111563513A (en) | 2020-08-21 |
CN111563513B CN111563513B (en) | 2022-06-24 |
Family
ID=72072132
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010411177.6A Active CN111563513B (en) | 2020-05-15 | 2020-05-15 | Defocus blur detection method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111563513B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112137613A (en) * | 2020-09-01 | 2020-12-29 | 沈阳东软智能医疗科技研究院有限公司 | Method and device for determining abnormal position, storage medium and electronic equipment |
CN113298154A (en) * | 2021-05-27 | 2021-08-24 | 安徽大学 | RGB-D image salient target detection method |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103996198A (en) * | 2014-06-04 | 2014-08-20 | 天津工业大学 | Method for detecting region of interest in complicated natural environment |
KR101649185B1 (en) * | 2015-02-27 | 2016-08-18 | 서울대학교 산학협력단 | Method and apparatus for calculating visual attention score |
US20170124432A1 (en) * | 2015-11-03 | 2017-05-04 | Baidu Usa Llc | Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering |
CN109872306A (en) * | 2019-01-28 | 2019-06-11 | 腾讯科技(深圳)有限公司 | Medical image cutting method, device and storage medium |
CN110084210A (en) * | 2019-04-30 | 2019-08-02 | 电子科技大学 | The multiple dimensioned Ship Detection of SAR image based on attention pyramid network |
CN110287960A (en) * | 2019-07-02 | 2019-09-27 | 中国科学院信息工程研究所 | The detection recognition method of curve text in natural scene image |
CN110490189A (en) * | 2019-07-04 | 2019-11-22 | 上海海事大学 | A kind of detection method of the conspicuousness object based on two-way news link convolutional network |
US20190362199A1 (en) * | 2018-05-25 | 2019-11-28 | Adobe Inc. | Joint blur map estimation and blur desirability classification from an image |
CN110648334A (en) * | 2019-09-18 | 2020-01-03 | 中国人民解放军火箭军工程大学 | Multi-feature cyclic convolution saliency target detection method based on attention mechanism |
CN111079584A (en) * | 2019-12-03 | 2020-04-28 | 东华大学 | Rapid vehicle detection method based on improved YOLOv3 |
-
2020
- 2020-05-15 CN CN202010411177.6A patent/CN111563513B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103996198A (en) * | 2014-06-04 | 2014-08-20 | 天津工业大学 | Method for detecting region of interest in complicated natural environment |
KR101649185B1 (en) * | 2015-02-27 | 2016-08-18 | 서울대학교 산학협력단 | Method and apparatus for calculating visual attention score |
US20170124432A1 (en) * | 2015-11-03 | 2017-05-04 | Baidu Usa Llc | Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering |
US20190362199A1 (en) * | 2018-05-25 | 2019-11-28 | Adobe Inc. | Joint blur map estimation and blur desirability classification from an image |
CN109872306A (en) * | 2019-01-28 | 2019-06-11 | 腾讯科技(深圳)有限公司 | Medical image cutting method, device and storage medium |
CN110084210A (en) * | 2019-04-30 | 2019-08-02 | 电子科技大学 | The multiple dimensioned Ship Detection of SAR image based on attention pyramid network |
CN110287960A (en) * | 2019-07-02 | 2019-09-27 | 中国科学院信息工程研究所 | The detection recognition method of curve text in natural scene image |
CN110490189A (en) * | 2019-07-04 | 2019-11-22 | 上海海事大学 | A kind of detection method of the conspicuousness object based on two-way news link convolutional network |
CN110648334A (en) * | 2019-09-18 | 2020-01-03 | 中国人民解放军火箭军工程大学 | Multi-feature cyclic convolution saliency target detection method based on attention mechanism |
CN111079584A (en) * | 2019-12-03 | 2020-04-28 | 东华大学 | Rapid vehicle detection method based on improved YOLOv3 |
Non-Patent Citations (4)
Title |
---|
CHANG TANG: "BR2Net: Defocus Blur Detection via a Bidirectional Channel Attention Residual Refining Network", IEEE Transactions on Multimedia *
XUEWEI WANG: "Accurate and Fast Blur Detection Using a Pyramid M-Shaped Deep Neural Network", IEEE Access *
ZHOU SHUANGSHUANG et al.: "Deep correlation tracking based on enhanced semantics and multi-attention mechanism learning", Computer Engineering *
MA SENQUAN et al.: "Improved small-object detection algorithm based on attention mechanism and feature fusion", Computer Applications and Software *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112137613A (en) * | 2020-09-01 | 2020-12-29 | 沈阳东软智能医疗科技研究院有限公司 | Method and device for determining abnormal position, storage medium and electronic equipment |
CN112137613B (en) * | 2020-09-01 | 2024-02-02 | 沈阳东软智能医疗科技研究院有限公司 | Determination method and device of abnormal position, storage medium and electronic equipment |
CN113298154A (en) * | 2021-05-27 | 2021-08-24 | 安徽大学 | RGB-D image salient target detection method |
CN113298154B (en) * | 2021-05-27 | 2022-11-11 | 安徽大学 | RGB-D image salient object detection method |
Also Published As
Publication number | Publication date |
---|---|
CN111563513B (en) | 2022-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111462126B (en) | Semantic image segmentation method and system based on edge enhancement | |
CN107564025B (en) | Electric power equipment infrared image semantic segmentation method based on deep neural network | |
CN111047551B (en) | Remote sensing image change detection method and system based on U-net improved algorithm | |
CN111354017B (en) | Target tracking method based on twin neural network and parallel attention module | |
CN110335290B (en) | Twin candidate region generation network target tracking method based on attention mechanism | |
CN108509978B (en) | Multi-class target detection method and model based on CNN (CNN) multi-level feature fusion | |
CN109711316B (en) | Pedestrian re-identification method, device, equipment and storage medium | |
CN110232394B (en) | Multi-scale image semantic segmentation method | |
CN108830280B (en) | Small target detection method based on regional nomination | |
CN110048827B (en) | Class template attack method based on deep learning convolutional neural network | |
CN111222562B (en) | Target detection method based on space self-attention mechanism | |
CN112489054A (en) | Remote sensing image semantic segmentation method based on deep learning | |
CN112966691A (en) | Multi-scale text detection method and device based on semantic segmentation and electronic equipment | |
CN111310582A (en) | Turbulence degradation image semantic segmentation method based on boundary perception and counterstudy | |
CN111563513B (en) | Defocus blur detection method based on attention mechanism | |
CN112365514A (en) | Semantic segmentation method based on improved PSPNet | |
CN113011329A (en) | Pyramid network based on multi-scale features and dense crowd counting method | |
CN111797841B (en) | Visual saliency detection method based on depth residual error network | |
CN113657491A (en) | Neural network design method for signal modulation type recognition | |
CN114332133A (en) | New coronary pneumonia CT image infected area segmentation method and system based on improved CE-Net | |
CN112149526A (en) | Lane line detection method and system based on long-distance information fusion | |
CN114663665A (en) | Gradient-based confrontation sample generation method and system | |
CN113743422B (en) | Crowd density estimation method, device and storage medium for multi-feature information fusion | |
CN112926667B (en) | Method and device for detecting saliency target of depth fusion edge and high-level feature | |
CN112329793B (en) | Significance detection method based on structure self-adaption and scale self-adaption receptive fields |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |