CN111563513A - Defocus blur detection method based on attention mechanism - Google Patents

Defocus blur detection method based on attention mechanism

Info

Publication number
CN111563513A
CN111563513A (application CN202010411177.6A)
Authority
CN
China
Prior art keywords: order, feature map, attention, feature, low
Prior art date
Legal status
Granted
Application number
CN202010411177.6A
Other languages
Chinese (zh)
Other versions
CN111563513B (en)
Inventor
Zhu Ce (朱策)
Jiang Zeyu (姜泽宇)
Liu Yipeng (刘翼鹏)
Liu Xiaoning (刘晓宁)
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202010411177.6A
Publication of CN111563513A
Application granted
Publication of CN111563513B
Status: Active

Classifications

    • G06V 10/26, 10/267 — Segmentation of patterns in the image field; detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06F 18/24 — Pattern recognition; classification techniques
    • G06F 18/253 — Pattern recognition; fusion techniques of extracted features
    • G06N 3/045 — Neural networks; architectures, combinations of networks
    • G06N 3/048 — Neural networks; activation functions


Abstract

The invention belongs to the technical field of image processing, and particularly relates to a defocus blur detection method based on an attention mechanism. The network structure of the invention uses a channel attention mechanism to extract the relationships between feature channels from a global perspective, which effectively improves the expressive power of the features, and simultaneously applies a spatial attention mechanism that combines high-order semantic information to selectively extract low-order information. The invention addresses two important problems in blur detection: correctly classifying smooth, in-focus regions, and effectively suppressing the influence of cluttered backgrounds.

Description

Defocus blur detection method based on attention mechanism
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a defocus blur detection method based on an attention mechanism.
Background
Defocus blur detection is a basic image processing task whose purpose is to segment the blurred parts of a picture. Blur detection has many applications, such as image deblurring, blur enhancement, and depth estimation. Among current state-of-the-art defocus blur detection methods, the convolutional neural network (CNN) is the most common and effective tool. Compared with traditional blur detection methods based on hand-crafted features, a convolutional neural network can effectively extract deep semantic information and thereby greatly improve detection results. Deep semantic information can effectively locate the blurred region, while low-order features can be used to determine the edge information of the detected region. Existing neural-network blur detection methods fuse multi-level features by constructing larger and deeper networks so that the network obtains better feature representations. For example, Zhao et al. propose a bottom-up fully convolutional network that fuses low-level cues and high-level semantic information for defocus blur detection. Tang et al. propose DeFusionNet, which recurrently fuses and refines multi-level feature maps, and then combines these multi-level feature maps to obtain the final detection result.
Although current defocus blur detection methods based on deep neural networks can extract deep semantic information and thereby improve detection accuracy, existing network models do not fully exploit the feature representation capability of the convolutional neural network (CNN). They perform poorly on the two main problems of defocus blur detection: first, smooth, in-focus regions can be wrongly classified as blurred blocks; second, because background clutter affects the detection result, these networks misclassify low-contrast in-focus regions and defocused regions with strong background noise. Some defocus blur detection methods increase the feature representation capability of the network by constructing larger and deeper structures, but these structures cannot effectively extract the relationships between intermediate feature layers, which limits the discrimination capability of the convolutional neural network. In addition, these methods superimpose all low-order feature information without distinction; however, different low-order information does not contribute equally to the detection result, and some structural features of the background can degrade detection performance or even cause some regions to be classified incorrectly.
Disclosure of Invention
The invention aims to solve the above problems and provides a novel neural network structure that, through attention mechanisms, gives the network better discrimination ability and better suppression of background noise.
The technical scheme adopted by the invention, as shown in FIG. 1, is a defocus blur detection method based on an attention mechanism, comprising the following steps:
Step 1: feed the input picture into a pre-trained VGG-16 network and extract multi-level feature maps;
Step 2: divide the multi-level feature maps into two groups, one used as high-order features and the other as low-order features;
Step 3: send the high-order and low-order feature maps through a channel attention mechanism, respectively, to enhance the feature representation of the network and obtain better discrimination and learning ability;
Step 4: upsample the high-order feature map to the same size as the low-order feature map; then apply a spatial attention mechanism that, guided by high-order semantic information, weights the detail features at different positions in the low-order feature map, giving effective detail information a larger weight and suppressing the influence of background clutter;
Step 5: fuse the high-order and low-order feature maps of different levels together through cross-channel concatenation;
Step 6: further fuse the features through a convolution layer and obtain the final detection result after a Sigmoid function.
The beneficial effect of the method is that it targets two important problems in blur detection: correctly classifying smooth, in-focus regions, and effectively suppressing the influence of cluttered backgrounds. The former problem, failing to classify whether a region is defocused, arises mainly because the discrimination ability of the network is not strong enough; to improve this, a channel attention mechanism is used in the network structure to extract the connections among feature layers from a global perspective, which effectively improves the expressive power of the features. For the latter problem, existing methods use all low-order information indiscriminately. However, different detail information contributes differently to the detection result: only the detail at the boundary between sharp and blurred regions contributes the most, and low-order information from a strongly cluttered background may even cause blurred regions to be wrongly judged as sharp. Therefore, the invention applies a spatial attention mechanism and selectively extracts low-order information by combining high-order semantic information.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic illustration of an attention mechanism process flow;
FIG. 3 is a schematic diagram of a spatial attention mechanism fused with low-level features;
FIG. 4 is a schematic diagram showing the comparison of the detection results of the detection method of the present invention with those of other detection methods;
FIG. 5 is a table comparing the MAE and F-measure evaluation criteria on the two public data sets DUT and Shi.
Detailed Description
The technical scheme of the invention is further described in detail with reference to the accompanying drawings.
In step 2 of the invention, the input picture is first sent to a VGG-16 network pre-trained on ImageNet to obtain the initial high-order and low-order feature maps. Specifically, the convolutional layers of VGG-16 are divided into two groups: conv1_2 and conv2_2 serve as the shallow network that extracts low-order information from the image, while conv3_3, conv4_3 and conv5_3 serve as the deep network that extracts high-order semantic information. Then an upsampling operation is applied to the feature maps of each group: conv2_2 is resized to the size of conv1_2, and conv4_3 and conv5_3 are resized to the size of conv3_3. This yields the initial high-order and low-order features. The high-order and low-order features are then sent, respectively, through a channel attention mechanism to extract the dependency relationships among the features.
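The grouping-and-resizing step can be sketched as follows. The channel counts are VGG-16's standard ones; the 64×64 input resolution and the nearest-neighbour interpolation are illustrative assumptions (the patent does not specify the interpolation method):

```python
import numpy as np

def upsample_nn(x, size):
    """Nearest-neighbour upsampling of a (C, H, W) feature map to (C, *size)."""
    c, h, w = x.shape
    th, tw = size
    rows = np.arange(th) * h // th      # source row for each target row
    cols = np.arange(tw) * w // tw      # source column for each target column
    return x[:, rows][:, :, cols]

# Hypothetical VGG-16 feature maps for a 64x64 input, shapes (C, H, W).
rng = np.random.default_rng(0)
feats = {
    "conv1_2": rng.random((64, 64, 64)),
    "conv2_2": rng.random((128, 32, 32)),
    "conv3_3": rng.random((256, 16, 16)),
    "conv4_3": rng.random((512, 8, 8)),
    "conv5_3": rng.random((512, 4, 4)),
}

# Low-order group: bring conv2_2 up to conv1_2's spatial size.
low_group = [feats["conv1_2"], upsample_nn(feats["conv2_2"], (64, 64))]
# High-order group: bring conv4_3 and conv5_3 up to conv3_3's spatial size.
high_group = [feats["conv3_3"],
              upsample_nn(feats["conv4_3"], (16, 16)),
              upsample_nn(feats["conv5_3"], (16, 16))]
```

After this step every map in a group shares one spatial size, so the later attention and fusion stages can operate per group.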
As shown in FIG. 2, in step 3, for an input feature map X ∈ R^(C×H×W) (where C is the number of channels, H the height and W the width of the feature map), the feature map is first reshaped into X1 ∈ R^(C×(H·W)). A matrix multiplication is then performed between X1 and its transpose X1^T ∈ R^((H·W)×C), and a softmax layer is applied to the product to obtain the attention map R ∈ R^(C×C):

r_ji = exp(X1_j · X1_i) / Σ_{i=1}^{C} exp(X1_j · X1_i),

where r_ji, the element in row j and column i of R, is the influence factor of the i-th channel on the j-th channel. The more similar two channel feature maps are, the stronger this connection. The attention map R is then matrix-multiplied with X1 to obtain an output of size C×(H·W), which is reshaped back to C×H×W. Finally, the output of the channel attention is multiplied by a scale coefficient α and superimposed on the original feature map through a residual connection to obtain the final output Y:

Y = α · reshape(R X1) + X.

The scale coefficient α is initialised to 0 at the beginning of training and gradually learns a suitable value during the training process. As the formula shows, the final output of the module is a weighted sum of all feature maps superimposed on the original input. Similar feature maps can reinforce each other, highlight regions of common interest and reduce variance. By considering the interrelationships between feature layers from a global perspective, the network achieves stronger discrimination.
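A minimal NumPy sketch of this channel-attention computation follows. The shapes and the residual form track the description; the row-wise softmax convention and the numerical-stability shift are implementation assumptions:

```python
import numpy as np

def channel_attention(x, alpha=0.0):
    """Channel attention: re-weight each channel by its similarity to all
    other channels, then add the result back through a residual connection.

    x     : input feature map of shape (C, H, W)
    alpha : learned scale coefficient, initialised to 0 in the patent
    """
    c, h, w = x.shape
    x1 = x.reshape(c, h * w)                      # X1 in R^(C x (H*W))
    energy = x1 @ x1.T                            # (C, C) channel similarities
    energy -= energy.max(axis=1, keepdims=True)   # stabilise the softmax
    r = np.exp(energy)
    r /= r.sum(axis=1, keepdims=True)             # attention map R, rows sum to 1
    out = (r @ x1).reshape(c, h, w)               # weighted sum of all channels
    return alpha * out + x                        # Y = alpha * reshape(R X1) + X
```

With α initialised to 0 the module starts as an identity mapping, so training begins from the unmodified backbone features and gradually learns how much cross-channel mixing to apply.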
As shown in FIG. 3, in step 4, let F_low ∈ R^(C×H×W) denote the low-order feature map and F_high ∈ R^(C×H×W) the upsampled high-order feature map. To extract global information and enlarge the receptive field without incurring too much computational cost, the invention applies two consecutive atrous (dilated) convolutions with 3×3 kernels and a dilation rate of 5 to the high-order feature map, and then uses a Sigmoid function to map the values into the [0, 1] interval; the result serves as the spatial attention map. The low-order feature map output by the module is the element-wise product of the input low-order feature map and the spatial attention map. Through this structure, the network explicitly and selectively extracts low-order detail information: detail that is more useful for the detection result receives a larger weight, while interference from the background is effectively suppressed.
In step 5, the high-order feature map is first upsampled to the same size as the low-order feature map. For defocus blur detection, high-order features can better locate the blurred block but lack detail information on irregular boundaries, while low-order features can be used to refine the detected boundaries but lack semantic information; features of different levels therefore need to be fused to obtain complementary information and thus optimise the detection result. Specifically, the high-order and low-order information is merged together through cross-channel concatenation to obtain the multi-level feature maps.
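The fusion and prediction steps can be sketched as follows. The patent only says "a convolution layer" followed by a Sigmoid, so the 1×1 kernel and the single output channel here are illustrative assumptions:

```python
import numpy as np

def fuse_and_predict(high, low, w, b=0.0):
    """Fuse high- and low-order feature maps by channel concatenation, then
    apply a 1x1 convolution (a per-pixel mix of channels) followed by a
    Sigmoid to produce the per-pixel blur probability map."""
    f = np.concatenate([high, low], axis=0)            # (C_high + C_low, H, W)
    logits = np.tensordot(w, f, axes=([0], [0])) + b   # 1x1 conv, one output channel
    return 1.0 / (1.0 + np.exp(-logits))               # probabilities in (0, 1)
```

The Sigmoid makes every pixel an independent blur probability, which matches the binary segmentation target of defocus blur detection.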
FIG. 4 compares the detection results produced by the method of the invention with those of other state-of-the-art defocus blur detection methods. It can be seen that the method of the invention accurately distinguishes smooth, in-focus regions and effectively suppresses interference from background noise.
The table in FIG. 5 compares the MAE (smaller is better) and F-measure (larger is better) evaluation criteria on the two public data sets DUT and Shi. It can be seen that the defocus blur detection method of the invention achieves the best performance in multiple respects, which demonstrates its effectiveness.

Claims (1)

1. The defocus blur detection method based on the attention mechanism is characterized by comprising the following steps of:
s1, inputting the picture into a pre-trained VGG-16 network, and extracting a multi-level feature map;
s2, dividing the multi-level feature graphs into two types, wherein one type is used as a high-level feature graph, and the other type is used as a low-level feature graph; the method specifically comprises the following steps: dividing convolutional layers of the VGG-16 network into two types, taking conv1_2 and conv2_2 as shallow networks to extract low-order information in the image, namely defining feature maps extracted by conv1_2 and conv2_2 as low-order feature maps; the conv3_3, conv4_3 and conv5_3 serve as deep network extraction image high-order information, namely feature maps extracted by the conv3_3, conv4_3 and conv5_3 are defined as high-order feature maps; then, using up-sampling operation on the low-order and high-order feature maps respectively, changing the feature map extracted by con2_2 into the same size as the feature map extracted by conv1_2, and changing the feature maps extracted by conv4_3 and conv5_3 into the same size as the feature map extracted by conv3_3, thereby obtaining an initial low-order feature map and a high-order feature map;
s3, respectively enabling the low-order characteristic diagram and the high-order characteristic diagram to pass through a channel attention mechanism to obtain a low-order attention characteristic diagram and a high-order attention characteristic diagram; the processing method of the channel attention mechanism comprises the following steps: for input feature maps
Figure FDA0002493310040000011
Wherein C represents the number of channels, H represents the length of the feature map, and W represents the width of the feature map, and the feature map is firstly deformed into
Figure FDA0002493310040000012
Then for x1And its device
Figure FDA0002493310040000013
Matrix multiplication is carried out, finally, a softmax layer is used for the multiplication result to obtain an attention characteristic graph R,
Figure FDA0002493310040000014
wherein r isjiRepresenting the value of the ith element in the jth row and the ith column in the attention feature map R, namely the influence factor of the ith channel on the jth channel;
transpose of input feature graph XTPerforming matrix multiplication with the characteristic diagram R to obtain the size of
Figure FDA0002493310040000015
An output of (d);
the final output Y is obtained by multiplying the output of the channel attention by a proportionality coefficient alpha and superposing the product on the original characteristic diagram in a residual error connection mode:
Figure FDA0002493310040000016
the proportionality coefficient alpha is initialized to 0 when training is started, and then is updated through the training process;
defining a feature map obtained after the low-order feature map passes through a channel attention mechanism as a low-order attention feature map, and defining a feature map obtained after the high-order feature map passes through the channel attention mechanism as a high-order attention feature map;
s4, the size of the obtained high-order attention feature graph is changed to be the same as that of the low-order attention feature graph through an upsampling operation, then the obtained high-order attention feature graph is subjected to hollow convolution with two continuous convolution kernels of 3 x 3 and an expansion rate of 5, and an output value is mapped into a [0,1] interval by using a Sigmoid function to obtain a space attention feature graph;
s5, fusing the space attention feature map and the low-order attention feature map through cross-channel connection to the obtained space attention feature map, and obtaining a fused low-order feature map;
and S6, further fusing the fused low-order feature map and the spatial attention feature map through a convolution layer, and obtaining a final detection result through a Sigmoid function.
CN202010411177.6A 2020-05-15 2020-05-15 Defocus blur detection method based on attention mechanism Active CN111563513B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010411177.6A CN111563513B (en) 2020-05-15 2020-05-15 Defocus blur detection method based on attention mechanism


Publications (2)

Publication Number Publication Date
CN111563513A (en) 2020-08-21
CN111563513B (en) 2022-06-24

Family

ID=72072132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010411177.6A Active CN111563513B (en) 2020-05-15 2020-05-15 Defocus blur detection method based on attention mechanism

Country Status (1)

Country Link
CN (1) CN111563513B (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996198A (en) * 2014-06-04 2014-08-20 天津工业大学 Method for detecting region of interest in complicated natural environment
KR101649185B1 (en) * 2015-02-27 2016-08-18 서울대학교 산학협력단 Method and apparatus for calculating visual attention score
US20170124432A1 (en) * 2015-11-03 2017-05-04 Baidu Usa Llc Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering
CN109872306A (en) * 2019-01-28 2019-06-11 腾讯科技(深圳)有限公司 Medical image cutting method, device and storage medium
CN110084210A (en) * 2019-04-30 2019-08-02 电子科技大学 The multiple dimensioned Ship Detection of SAR image based on attention pyramid network
CN110287960A (en) * 2019-07-02 2019-09-27 中国科学院信息工程研究所 The detection recognition method of curve text in natural scene image
CN110490189A (en) * 2019-07-04 2019-11-22 上海海事大学 A kind of detection method of the conspicuousness object based on two-way news link convolutional network
US20190362199A1 (en) * 2018-05-25 2019-11-28 Adobe Inc. Joint blur map estimation and blur desirability classification from an image
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN111079584A (en) * 2019-12-03 2020-04-28 东华大学 Rapid vehicle detection method based on improved YOLOv3


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHANG TANG: "BR2Net: defocus blur detection via a bidirectional channel attention residual refining network", IEEE Transactions on Multimedia *
XUEWEI WANG: "Accurate and fast blur detection using a pyramid M-shaped deep neural network", IEEE Access *
ZHOU SHUANGSHUANG et al. (周双双等): "Deep correlation tracking based on enhanced semantics and multi-attention mechanism learning", Computer Engineering (《计算机工程》) *
MA SENQUAN et al. (麻森权等): "Improved small-object detection algorithm based on attention mechanism and feature fusion", Computer Applications and Software (《计算机应用与软件》) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112137613A (en) * 2020-09-01 2020-12-29 沈阳东软智能医疗科技研究院有限公司 Method and device for determining abnormal position, storage medium and electronic equipment
CN112137613B (en) * 2020-09-01 2024-02-02 沈阳东软智能医疗科技研究院有限公司 Determination method and device of abnormal position, storage medium and electronic equipment
CN113298154A (en) * 2021-05-27 2021-08-24 安徽大学 RGB-D image salient target detection method
CN113298154B (en) * 2021-05-27 2022-11-11 安徽大学 RGB-D image salient object detection method

Also Published As

Publication number Publication date
CN111563513B (en) 2022-06-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant