CN110490813B - Feature map enhancement method, device, equipment and medium for convolutional neural network - Google Patents
- Publication number: CN110490813B
- Application number: CN201910605387.6A
- Authority: CN (China)
- Prior art keywords: sub-feature map, feature, channel, enhancement
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045 — Computing arrangements based on biological models; neural networks; architecture; combinations of networks
- G06T5/00 — Image enhancement or restoration
- G06T2207/10004 — Still image; photographic image
- G06T2207/20081 — Training; learning
Abstract
The application discloses a feature map enhancement method, apparatus, device, and medium for a convolutional neural network. A convolution operation is performed on an input original image to obtain a corresponding multi-layer feature map, and the feature map of a given layer is grouped along the channel dimension into a plurality of sub-feature maps. For each sub-feature map, an embedded spatial group-wise enhancement (SGE) module performs global average pooling and global max pooling in parallel to obtain two corresponding channel-dimension vectors, from which an attention enhancement factor for each channel of the sub-feature map is derived. An enhanced sub-feature map is obtained from the attention enhancement factor and the corresponding sub-feature map, and the enhanced feature map of the layer is assembled from all the enhanced sub-feature maps. The semantic information describing the relative importance of sub-feature-map channels is thereby better expressed, improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
Description
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a method, an apparatus, a device, and a medium for enhancing a feature map of a convolutional neural network.
Background
With the rise of deep learning, CNNs (Convolutional Neural Networks) have seen ever-wider development and application in computer vision, and researchers have proposed many convolution variants, such as transposed convolution, dilated convolution, grouped convolution, depthwise separable convolution, pointwise convolution, and deformable convolution. Grouped convolution offers clear advantages in reducing computation and parameter count and in preventing overfitting, and it mirrors the grouping ideas behind hand-designed features in early computer vision, such as HOG (Histogram of Oriented Gradients), SIFT (Scale-Invariant Feature Transform), and LBP (Local Binary Patterns). Many classical networks (AlexNet, ResNeXt, MobileNet, ShuffleNet, CapsNet, and others) adopt this grouping idea: the feature maps are grouped along the channel dimension, and each group of sub-feature maps is then convolved or normalized, so that the semantic feature information of specific regions is better expressed, yielding excellent performance gains in the computer vision field.
At present, most CNN architectures improve the feature expression capability of a model by introducing an attention mechanism, which has become very popular in computer vision and related fields. Various network structures introduce channel-dimension or spatial-dimension attention to enhance useful channel information and suppress useless channel information; they may also fuse multi-scale features or global context information to further strengthen the enhancement of specific feature-map regions, making the network more interpretable. It follows that applying an attention mechanism to the grouped sub-feature maps can further enhance the learning and expression of region-specific semantic feature information while suppressing noise and interference.
However, the semantic feature information extracted by the SGE (Spatial Group-wise Enhancement) module embedded in a conventional CNN structure is not expressed sufficiently, which degrades the performance of the CNN on tasks such as image classification, segmentation, and detection.
Disclosure of Invention
The application aims to provide a feature map enhancement method, apparatus, device, and medium that improve the performance of a convolutional neural network on tasks such as image classification, segmentation, and detection.
In a first aspect, an embodiment of the present application provides a feature map enhancement method for a convolutional neural network, including:
performing a convolution operation on an input original image to obtain a corresponding multi-layer feature map;
grouping the feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
for each sub-feature map, performing global average pooling and global max pooling in parallel by using an embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors;
obtaining an attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel-dimension vectors;
obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
and obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
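The claimed steps can be sketched end to end in NumPy. This is a minimal illustration, not the claimed implementation: the 1 × 1 convolutions are modeled as random weight matrices standing in for learned parameters, and the reduction ratio r, scale α, and shift β are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def enhance_group(x, r=2, alpha=1.0, beta=0.0, eps=1e-5):
    """Enhance one sub-feature map x of shape (c, H, W) following the claimed steps."""
    c = x.shape[0]
    a = x.mean(axis=(1, 2))                        # global average pooling -> (c,)
    b = x.max(axis=(1, 2))                         # global max pooling     -> (c,)
    W1 = 0.1 * rng.standard_normal((c // r, c))    # 1x1 conv = matrix on channel dim
    W2 = 0.1 * rng.standard_normal((c // r, c))
    g = np.maximum(W1 @ a, 0) + np.maximum(W2 @ b, 0)  # ReLU, then add
    W3 = 0.1 * rng.standard_normal((c, c // r))    # 1x1 conv raising back to c channels
    u = softmax(W3 @ g)                            # normalized attention factor
    y = u[:, None, None] * x                       # first sub-feature map
    mu = y.mean(axis=(1, 2), keepdims=True)        # regularize per channel
    var = y.var(axis=(1, 2), keepdims=True)
    z = (y - mu) / np.sqrt(var + eps)              # second sub-feature map
    return x * sigmoid(alpha * z + beta)           # third (enhanced) sub-feature map

X = rng.standard_normal((12, 5, 5))                # one layer's feature map, C = 12
subs = np.split(X, 3, axis=0)                      # G = 3 sub-feature maps
V = np.concatenate([enhance_group(s) for s in subs], axis=0)
print(V.shape)                                     # (12, 5, 5)
```

The enhanced sub-feature maps are concatenated back along the channel dimension, so the output has the same shape as the input layer's feature map.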
In a possible implementation of the above method, obtaining the attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel-dimension vectors includes:
reducing the dimension of the two corresponding channel-dimension vectors by using a 1 × 1 convolution;
and activating the two dimension-reduced channel-dimension vectors with a ReLU activation function and adding them, so as to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In a possible implementation of the above method, obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map includes:
raising the attention enhancement factor to the number of channels of the corresponding sub-feature map by using a 1 × 1 convolution;
normalizing the attention enhancement factor by using a SoftMax function;
multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularizing the first sub-feature map to obtain a second sub-feature map;
activating the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map;
and taking the third sub-feature map as the enhanced sub-feature map.
In a possible implementation of the above method, activating the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map includes:
obtaining an importance coefficient for each channel of the second sub-feature map by using a Sigmoid activation function;
and scaling the second sub-feature map by the importance coefficient to recalibrate the importance of the spatial-domain features on each channel of the second sub-feature map, so as to obtain an enhanced third sub-feature map.
In a second aspect, an embodiment of the present application provides a feature map enhancement apparatus for a convolutional neural network, including:
a convolution module configured to perform a convolution operation on an input original image to obtain a corresponding multi-layer feature map;
a grouping module configured to group the feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
an enhancement module configured to, for each sub-feature map, perform global average pooling and global max pooling in parallel by using an embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors; obtain an attention enhancement factor of each channel in the corresponding sub-feature map according to the two channel-dimension vectors; and obtain a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
and an output module configured to obtain an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
In a possible implementation of the above apparatus, the enhancement module is specifically configured to:
reduce the dimension of the two corresponding channel-dimension vectors by using a 1 × 1 convolution;
and activate the two dimension-reduced channel-dimension vectors with a ReLU activation function and add them, so as to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In a possible implementation of the above apparatus, the enhancement module is specifically configured to:
raise the attention enhancement factor to the number of channels of the corresponding sub-feature map by using a 1 × 1 convolution;
normalize the attention enhancement factor by using a SoftMax function;
multiply the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularize the first sub-feature map to obtain a second sub-feature map;
activate the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map;
and take the third sub-feature map as the enhanced sub-feature map.
In a possible implementation of the above apparatus, the enhancement module is specifically configured to:
obtain an importance coefficient for each channel of the second sub-feature map by using a Sigmoid activation function;
and scale the second sub-feature map by the importance coefficient to recalibrate the importance of the spatial-domain features on each channel of the second sub-feature map, so as to obtain an enhanced third sub-feature map.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor;
the memory for storing a computer program;
wherein the processor executes the computer program in the memory to implement the method described in the first aspect and the various embodiments of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the method described in the first aspect and the implementation manners of the first aspect when executed by a processor.
Compared with the prior art, the feature map enhancement method, apparatus, device, and medium for a convolutional neural network provided herein perform a convolution operation on an input original image to obtain a corresponding multi-layer feature map, and group the feature map of a certain layer along the channel dimension into a plurality of sub-feature maps. For each sub-feature map, an embedded spatial group-wise enhancement (SGE) module performs global average pooling and global max pooling in parallel to obtain two corresponding channel-dimension vectors, from which an attention enhancement factor for each channel of the sub-feature map is derived; a corresponding enhanced sub-feature map is then obtained from the attention enhancement factor and the sub-feature map, and the enhanced feature map of the layer is assembled from all the enhanced sub-feature maps. Because the channel-dimension attention enhancement factors are extracted through both global average pooling and global max pooling, the semantic information describing the relative importance of sub-feature-map channels is better expressed. At the same time, the spatial group-wise enhancement module structure is redesigned so that the attention enhancement factors of the sub-feature-map channel dimension are computed more effectively, further improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
Drawings
Fig. 1 is a schematic flowchart of a feature map enhancement method for a convolutional neural network according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a flow of an algorithm for enhancing a sub-feature map according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a feature map enhancing apparatus of a convolutional neural network according to a second embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present application is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
Fig. 1 is a schematic flowchart of a feature map enhancement method for a convolutional neural network according to an embodiment of the present disclosure. In practical applications, the execution subject of this embodiment may be a feature map enhancement apparatus of a convolutional neural network, where the apparatus may be implemented as a virtual apparatus, such as software code, or as a physical apparatus storing the relevant executable code, such as a USB drive, or as a physical apparatus integrating the relevant executable code, such as a chip or a computer.
As shown in fig. 1, the method includes the following steps S101 to S106:
s101, performing convolution operation on the input original image to obtain a corresponding multilayer characteristic diagram.
In this embodiment, a pre-constructed convolutional neural network performs a multilayer convolution operation on an input original image, so as to obtain a corresponding multilayer feature map. It is understood that one of the layers of the feature map includes a certain number of channels.
And S102, grouping the feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps.
In this embodiment, during convolutional neural network learning, grouped convolution can gradually capture specific semantic responses, so that the response at positions of interest grows larger while other positions remain unactivated or without response; grouped convolution also reduces computation and parameter count. The feature map is therefore grouped first, to better enhance the learning of region-specific semantic feature information. Specifically, the channels of the feature map are divided into groups, yielding a number of sub-feature maps equal to the number of groups.
Assume the output feature map after multi-layer convolution is X ∈ R^(C×H×W), where C is the number of channels and H and W are the height and width of the feature map. The feature map is first divided into G groups along the channel dimension; the vectors at the spatial positions of a resulting sub-feature map are denoted X = {x_1, ..., x_{H×W}}, where each element x_i ∈ R^(C/G).
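As an illustrative sketch (not part of the claimed embodiments), the grouping step can be expressed in NumPy; the sizes C = 12, G = 3, H = W = 5 are hypothetical:

```python
import numpy as np

# Hypothetical sizes: C = 12 channels, H = W = 5, G = 3 groups.
X = np.arange(12 * 5 * 5, dtype=float).reshape(12, 5, 5)
G = 3
subs = np.split(X, G, axis=0)          # G sub-feature maps along the channel dim
print(len(subs), subs[0].shape)        # 3 (4, 5, 5)

# The vector x_i at one spatial position of a sub-feature map has C/G elements:
x_i = subs[0][:, 2, 3]
print(x_i.shape)                       # (4,)
```

Each sub-feature map keeps the full spatial extent H × W and holds C/G of the original channels.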
S103, for each sub-feature map, performing global average pooling and global max pooling in parallel by using the embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors.
And S104, obtaining the attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel-dimension vectors.
In this embodiment, step S104 may be implemented as: reducing the dimension of the two corresponding channel-dimension vectors by using a 1 × 1 convolution; then activating the two dimension-reduced channel-dimension vectors with a ReLU activation function and adding them to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In practical applications, some CNN structures improve a model's feature expression capability by introducing an attention mechanism, which not only tells the network model which important features to attend to but also enhances the expression of specific regions. However, cascading attention enhancement modules along the channel and spatial dimensions also increases the computation and parameter count of the network model.
Fig. 2 is a schematic flowchart of an algorithm for enhancing a sub-feature map according to an embodiment of the present disclosure. As shown in fig. 2, the left X column of the diagram represents that the feature maps are grouped into 3 sub-feature maps, and each sub-feature map is enhanced through the algorithm flow below the diagram to obtain the corresponding enhanced feature map in the right V column of the diagram.
The above algorithm flow is described in detail below. Considering computation and model size, the method uses only a channel-dimension attention mechanism: global average pooling generates a response for every spatial position of the sub-feature map, and global max pooling is combined in so that, during backpropagation, gradient feedback flows only to the position with the maximum feature response. The spatial group-wise enhancement module structure is redesigned at the same time: the sub-feature maps are processed in parallel using global statistical feature information, and channel-dimension attention enhancement factors are extracted with global average pooling and global max pooling respectively, to express the semantic information of the relative importance among sub-feature-map channels.
The global statistical feature information is extracted as follows:

a = (1/(H×W)) Σ_{i=1}^{H×W} x_i    (1)
b = max_i(x_i), i = 1, ..., H×W    (2)

where a and b denote the channel-dimension vectors produced by the parallel global average pooling and global max pooling respectively, and max(·) takes the maximum response over all spatial positions of the channel-dimension vector, giving the maximum activation of each channel. This compresses each grouped sub-feature map from C/G × H × W to C/G × 1 × 1; each value in a channel-dimension vector then uses global statistical feature information to represent the importance of the corresponding channel of the grouped sub-feature map.
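The two pooling branches of equations (1) and (2) reduce to axis-wise reductions in NumPy; the sub-feature map size (C/G = 4, H = W = 5) is a hypothetical example:

```python
import numpy as np

x = np.random.default_rng(1).standard_normal((4, 5, 5))  # one sub-feature map, C/G = 4

a = x.mean(axis=(1, 2))   # global average pooling: (C/G, H, W) -> (C/G,)
b = x.max(axis=(1, 2))    # global max pooling:     (C/G, H, W) -> (C/G,)
print(a.shape, b.shape)   # (4,) (4,)
assert np.all(b >= a)     # per channel, the max response dominates the average
```

Each sub-feature map thus yields exactly two channel-dimension vectors of length C/G.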
It can be understood that in this embodiment the global max pooling information is effectively incorporated into the attention-enhancement-factor calculation, strengthening the semantic feature expression capability of the spatial group-wise enhancement module.
To model the interdependency among sub-feature-map channels, the channel-dimension vectors are dimension-reduced with a 1 × 1 convolution followed by a ReLU activation function, which increases the nonlinear interaction of information between channels while reducing computation:

e = ReLU(W1 a)    (3)
f = ReLU(W2 b)    (4)

where W1 and W2 are the weight matrices of the 1 × 1 dimension-reduction convolutions, W1, W2 ∈ R^((C/(G·r)) × (C/G)) for a reduction ratio r, so the channel dimension satisfies the reduction from C/G to C/(G·r).

Adding the two channel-dimension vectors then yields the attention enhancement factor corresponding to each channel of the sub-feature map:

g = e + f    (5)
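Equations (3) to (5) can be sketched as follows; on a 1 × 1 spatial map, a 1 × 1 convolution is just a matrix multiply on the channel dimension. The weights and the reduction ratio r are hypothetical stand-ins for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
c, r = 4, 2                       # channels per group and reduction ratio (hypothetical)
a = rng.standard_normal(c)        # pooled average vector from eq. (1)
b = rng.standard_normal(c)        # pooled max vector from eq. (2)

W1 = rng.standard_normal((c // r, c))   # 1x1 dimension-reduction convolutions
W2 = rng.standard_normal((c // r, c))

e = np.maximum(W1 @ a, 0.0)       # e = ReLU(W1 a), eq. (3)
f = np.maximum(W2 @ b, 0.0)       # f = ReLU(W2 b), eq. (4)
g = e + f                         # attention enhancement factor, eq. (5)
print(g.shape)                    # (2,)
```

Because both branches pass through ReLU before the addition, the reduced-dimension factor g is elementwise non-negative.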
and S105, obtaining a corresponding enhancer characteristic map according to the attention enhancement factor and the corresponding sub characteristic map.
In this embodiment, the step S105 may be implemented as: using 1 × 1 convolution to raise the attention enhancement factor to the number of channels corresponding to the sub-feature map; normalizing the attention enhancement factor using a SoftMax function; multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map; regularizing the first sub-feature graph to obtain a second sub-feature graph; activating the second sub-feature graph by using a Sigmoid activation function to obtain an enhanced third sub-feature graph; and taking the third sub-feature map as an enhancer feature map.
To add more nonlinearity and better fit the complex relationships between channels, a 1 × 1 convolution is first used to raise the dimension of the attention enhancement factor back to the number of channels of the sub-feature map, so that it matches the sub-feature map during the subsequent weighting. The raised and normalized attention enhancement factor is:

u = SoftMax(W3 g)    (6)

where W3 is the weight matrix of the 1 × 1 dimension-raising convolution.
For each spatial position of the sub-feature map, the vector x_i is weighted by the normalized attention enhancement factor u described above, yielding the sub-feature map enhanced by the channel attention mechanism:

y_i = u · x_i    (7)
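Equations (6) and (7) amount to a dimension-raising matrix multiply, a SoftMax, and a per-channel broadcast multiply. A sketch under the same hypothetical sizes as above, with a random W3 standing in for the learned weights:

```python
import numpy as np

rng = np.random.default_rng(3)
c, r = 4, 2
g = rng.standard_normal(c // r)           # reduced-dim attention factor
x = rng.standard_normal((c, 5, 5))        # sub-feature map

W3 = rng.standard_normal((c, c // r))     # 1x1 conv raising the dim back to c
logits = W3 @ g
u = np.exp(logits - logits.max())
u /= u.sum()                              # SoftMax-normalized factor, sums to 1

y = u[:, None, None] * x                  # eq. (7): y_i = u . x_i at every position
print(round(float(u.sum()), 6), y.shape)
```

The SoftMax turns the channel importances into a probability distribution, so the weighting preserves the relative ordering of channel responses.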
Further, activating the second sub-feature map with a Sigmoid activation function to obtain an enhanced third sub-feature map may be realized as follows: obtaining an importance coefficient for each channel of the second sub-feature map with a Sigmoid activation function; and scaling the second sub-feature map by the importance coefficient to recalibrate the importance of the spatial-domain features on each channel, so as to obtain an enhanced third sub-feature map.
In this embodiment, to eliminate the interference of amplitude differences between samples, the sub-feature map enhanced by the channel attention mechanism is regularized. For each spatial position of the sub-feature map, the regularization is:

μ_c = (1/(H×W)) Σ_{i=1}^{H×W} y_{c,i}    (8)
σ_c² = (1/(H×W)) Σ_{i=1}^{H×W} (y_{c,i} − μ_c)²    (9)
z_{c,i} = (y_{c,i} − μ_c) / sqrt(σ_c² + ε)    (10)

where z_i denotes the sub-feature map regularized along the channel dimension, μ_c the per-channel mean of the sub-feature map, σ_c² its per-channel variance, and ε a small constant for numerical stability.
Then a Sigmoid activation function is used to obtain an importance coefficient along the channel dimension of the regularized sub-feature map, and the sub-feature map is scaled by this coefficient to recalibrate the spatial-domain feature importance of each channel. This process is expressed as:

v_i = x_i · Sigmoid(α z_i + β)    (11)

where α and β are the parameters of the scaling and translation operations on the regularized sub-feature map, each taking a single shared value within each group.
It can be understood that in this embodiment the spatial group-wise enhancement module structure is redesigned: a 1 × 1 convolution reduces the dimension before fusion and raises it back afterwards, which not only reduces computation but also strengthens the information interaction along the sub-feature-map channel dimension, and the SoftMax activation function converts the channel-dimension spatial feature importance into probabilities that represent the attention enhancement factor.
And S106, obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
According to the feature map enhancement method for a convolutional neural network described above, channel-dimension attention enhancement factors are extracted through global average pooling and global max pooling, so the semantic information of the relative importance among sub-feature-map channels is better expressed. At the same time, the spatial group-wise enhancement module structure is redesigned so that the attention enhancement factors of the sub-feature-map channel dimension are computed more effectively, further improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 3 is a schematic structural diagram of a feature map enhancing apparatus of a convolutional neural network according to a second embodiment of the present application, and as shown in fig. 3, the apparatus may include:
a convolution module 310, configured to perform a convolution operation on an input original image to obtain a corresponding multi-layer feature map;
a grouping module 320, configured to group the feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
an enhancement module 330, configured to, for each sub-feature map, perform global average pooling and global max pooling in parallel by using the embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors; obtain an attention enhancement factor of each channel in the corresponding sub-feature map according to the two channel-dimension vectors; and obtain a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
and an output module 340, configured to obtain an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
According to the feature map enhancement apparatus for a convolutional neural network described above, channel-dimension attention enhancement factors are extracted through global average pooling and global max pooling, so the semantic information of the relative importance among sub-feature-map channels is better expressed. At the same time, the spatial group-wise enhancement module structure is redesigned so that the attention enhancement factors of the sub-feature-map channel dimension are computed more effectively, further improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
In some embodiments, the enhancement module 330 is specifically configured to:
reduce the dimension of the two corresponding channel-dimension vectors using a 1×1 convolution;
and activate the two dimension-reduced channel-dimension vectors with a ReLU activation function and add them to obtain the attention enhancement factor for each channel of the corresponding sub-feature map.
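With the 1×1 convolution reduced to a plain matrix acting on channel vectors, the two steps above can be illustrated as follows; the names and shapes are illustrative, and whether the two pooling branches share weights is not stated in the source, so shared weights are an assumption of this sketch:

```python
import numpy as np

def attention_factor(avg_vec, max_vec, W_reduce):
    """Reduce both pooled channel vectors with shared 1x1-convolution weights
    (a plain matrix here), ReLU-activate each, then add them."""
    reduced_avg = np.maximum(W_reduce @ avg_vec, 0.0)  # ReLU after dimension reduction
    reduced_max = np.maximum(W_reduce @ max_vec, 0.0)
    return reduced_avg + reduced_max                   # attention enhancement factor
```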
In some embodiments, the enhancement module 330 is specifically configured to:
raise the attention enhancement factor back to the number of channels of the corresponding sub-feature map using a 1×1 convolution;
normalize the attention enhancement factor using a SoftMax function;
multiply the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularize the first sub-feature map to obtain a second sub-feature map;
activate the second sub-feature map with a Sigmoid activation function to obtain an enhanced third sub-feature map;
and take the third sub-feature map as the enhanced sub-feature map.
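These six steps can be sketched minimally in NumPy. The weight name `W_up` is hypothetical, and zero-mean/unit-variance standardization is assumed for the unspecified regularization step:

```python
import numpy as np

def enhanced_sub_map(sub, factor, W_up):
    """sub: (C, H, W) sub-feature map; factor: reduced attention factor; W_up: (C, k)."""
    f = W_up @ factor                                   # raise factor back to C channels
    f = np.exp(f - f.max())
    f /= f.sum()                                        # SoftMax normalization
    first = sub * f[:, None, None]                      # enhanced first sub-feature map
    second = (first - first.mean()) / (first.std() + 1e-5)  # regularized second map
    coeff = 1.0 / (1.0 + np.exp(-second))               # Sigmoid activation
    third = second * coeff                              # enhanced third sub-feature map
    return third
```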
In some embodiments, the enhancement module 330 is specifically configured to:
obtain an importance coefficient for each channel of the second sub-feature map using a Sigmoid activation function;
and scale the second sub-feature map by the importance coefficient, recalibrating the importance of the spatial-domain features on each channel of the second sub-feature map to obtain the enhanced third sub-feature map.
Fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application. As shown in Fig. 4, the electronic device includes: a memory 401 and a processor 402;
a memory 401 for storing a computer program;
wherein the processor 402 executes the computer program in the memory 401 to implement the methods provided by the method embodiments described above.
In an embodiment, the feature map enhancement apparatus of a convolutional neural network provided in the present application is embodied as an electronic device. The processor may be a central processing unit (CPU) or another form of processing unit having data-processing and/or instruction-execution capabilities, and may control other components of the electronic device to perform desired functions.
The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory. Non-volatile memory may include, for example, read-only memory (ROM), a hard disk, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor to implement the methods of the embodiments of the present application described above and/or other desired functions. Various contents, such as an input signal, a signal component, and a noise component, may also be stored in the computer-readable storage medium.
An embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the methods provided by the method embodiments described above when being executed by a processor.
In practice, the computer program in this embodiment may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language, to carry out the operations of the embodiments of the present application. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
In practice, the computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing descriptions of specific exemplary embodiments of the present application have been presented for purposes of illustration and description. It is not intended to limit the application to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the present application and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the present application and various alternatives and modifications thereof. It is intended that the scope of the application be defined by the claims and their equivalents.
Claims (6)
1. A feature map enhancement method for a convolutional neural network, comprising:
performing a convolution operation on an input original image to obtain corresponding multi-layer feature maps;
grouping a feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
for each sub-feature map, performing global average pooling and global max pooling in parallel using an embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors;
obtaining an attention enhancement factor for each channel of the corresponding sub-feature map according to the two corresponding channel-dimension vectors;
obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps;
wherein obtaining the attention enhancement factor for each channel of the corresponding sub-feature map according to the two corresponding channel-dimension vectors comprises:
reducing the dimension of the two corresponding channel-dimension vectors using a 1×1 convolution;
activating the two dimension-reduced channel-dimension vectors with a ReLU activation function and adding them to obtain the attention enhancement factor for each channel of the corresponding sub-feature map;
wherein obtaining the corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map comprises:
raising the attention enhancement factor back to the number of channels of the corresponding sub-feature map using a 1×1 convolution;
normalizing the attention enhancement factor using a SoftMax function;
multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularizing the first sub-feature map to obtain a second sub-feature map;
activating the second sub-feature map with a Sigmoid activation function to obtain an enhanced third sub-feature map;
and taking the third sub-feature map as the enhanced sub-feature map.
2. The method according to claim 1, wherein activating the second sub-feature map with a Sigmoid activation function to obtain an enhanced third sub-feature map comprises:
obtaining an importance coefficient for each channel of the second sub-feature map using a Sigmoid activation function;
and scaling the second sub-feature map by the importance coefficient, recalibrating the importance of the spatial-domain features on each channel of the second sub-feature map to obtain the enhanced third sub-feature map.
3. A feature map enhancement apparatus for a convolutional neural network, comprising:
a convolution module, configured to perform a convolution operation on an input original image to obtain corresponding multi-layer feature maps;
a grouping module, configured to group a feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
an enhancement module, configured to, for each sub-feature map, perform global average pooling and global max pooling in parallel using an embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors; obtain an attention enhancement factor for each channel of the corresponding sub-feature map according to the two corresponding channel-dimension vectors; and obtain a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
an output module, configured to obtain an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps;
wherein the enhancement module is specifically configured to:
reduce the dimension of the two corresponding channel-dimension vectors using a 1×1 convolution;
activate the two dimension-reduced channel-dimension vectors with a ReLU activation function and add them to obtain the attention enhancement factor for each channel of the corresponding sub-feature map;
wherein the enhancement module is further specifically configured to:
raise the attention enhancement factor back to the number of channels of the corresponding sub-feature map using a 1×1 convolution;
normalize the attention enhancement factor using a SoftMax function;
multiply the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularize the first sub-feature map to obtain a second sub-feature map;
activate the second sub-feature map with a Sigmoid activation function to obtain an enhanced third sub-feature map;
and take the third sub-feature map as the enhanced sub-feature map.
4. The apparatus according to claim 3, wherein the enhancement module is specifically configured to:
obtain an importance coefficient for each channel of the second sub-feature map using a Sigmoid activation function;
and scale the second sub-feature map by the importance coefficient, recalibrating the importance of the spatial-domain features on each channel of the second sub-feature map to obtain the enhanced third sub-feature map.
5. An electronic device, comprising: a memory and a processor;
the memory for storing a computer program;
wherein the processor executes the computer program in the memory to implement the method of any one of claims 1-2.
6. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1-2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910605387.6A CN110490813B (en) | 2019-07-05 | 2019-07-05 | Feature map enhancement method, device, equipment and medium for convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110490813A CN110490813A (en) | 2019-11-22 |
CN110490813B true CN110490813B (en) | 2021-12-17 |
Family
ID=68546677
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910605387.6A Active CN110490813B (en) | 2019-07-05 | 2019-07-05 | Feature map enhancement method, device, equipment and medium for convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490813B (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111027670B (en) * | 2019-11-04 | 2022-07-22 | 重庆特斯联智慧科技股份有限公司 | Feature map processing method and device, electronic equipment and storage medium |
CN111161195B (en) * | 2020-01-02 | 2023-10-13 | 重庆特斯联智慧科技股份有限公司 | Feature map processing method and device, storage medium and terminal |
CN113222827A (en) * | 2020-01-21 | 2021-08-06 | 北京三星通信技术研究有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN111274999B (en) * | 2020-02-17 | 2024-04-19 | 北京迈格威科技有限公司 | Data processing method, image processing device and electronic equipment |
CN113361529B (en) * | 2020-03-03 | 2024-05-10 | 北京四维图新科技股份有限公司 | Image semantic segmentation method and device, electronic equipment and storage medium |
CN111325751B (en) * | 2020-03-18 | 2022-05-27 | 重庆理工大学 | CT image segmentation system based on attention convolution neural network |
CN111444957B (en) * | 2020-03-25 | 2023-11-07 | 腾讯科技(深圳)有限公司 | Image data processing method, device, computer equipment and storage medium |
CN111539325A (en) * | 2020-04-23 | 2020-08-14 | 四川旅游学院 | Forest fire detection method based on deep learning |
CN111967478B (en) * | 2020-07-08 | 2023-09-05 | 特斯联科技集团有限公司 | Feature map reconstruction method, system, storage medium and terminal based on weight overturn |
CN112001248B (en) * | 2020-07-20 | 2024-03-01 | 北京百度网讯科技有限公司 | Active interaction method, device, electronic equipment and readable storage medium |
CN112149694B (en) * | 2020-08-28 | 2024-04-05 | 特斯联科技集团有限公司 | Image processing method, system, storage medium and terminal based on convolutional neural network pooling module |
CN112183645B (en) * | 2020-09-30 | 2022-09-09 | 深圳龙岗智能视听研究院 | Image aesthetic quality evaluation method based on context-aware attention mechanism |
CN112465828B (en) * | 2020-12-15 | 2024-05-31 | 益升益恒(北京)医学技术股份公司 | Image semantic segmentation method and device, electronic equipment and storage medium |
CN112668656B (en) * | 2020-12-30 | 2023-10-13 | 深圳市优必选科技股份有限公司 | Image classification method, device, computer equipment and storage medium |
CN112862667A (en) * | 2021-01-29 | 2021-05-28 | 成都商汤科技有限公司 | Pooling method, chip, equipment and storage medium |
CN112767406B (en) * | 2021-02-02 | 2023-12-12 | 苏州大学 | Deep convolution neural network training method for corneal ulcer segmentation and segmentation method |
CN113011465B (en) * | 2021-02-25 | 2021-09-03 | 浙江净禾智慧科技有限公司 | Household garbage throwing intelligent supervision method based on grouping multi-stage fusion |
CN113052173B (en) * | 2021-03-25 | 2024-07-19 | 岳阳市金霖昇行科技有限公司 | Sample data characteristic enhancement method and device |
CN113486898B (en) * | 2021-07-08 | 2024-05-31 | 西安电子科技大学 | Radar signal RD image interference identification method and system based on improvement ShuffleNet |
CN113920099B (en) * | 2021-10-15 | 2022-08-30 | 深圳大学 | Polyp segmentation method based on non-local information extraction and related components |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9953437B1 (en) * | 2017-10-18 | 2018-04-24 | StradVision, Inc. | Method and device for constructing a table including information on a pooling type and testing method and testing device using the same |
CN109299268A (en) * | 2018-10-24 | 2019-02-01 | 河南理工大学 | A kind of text emotion analysis method based on dual channel model |
CN109376804B (en) * | 2018-12-19 | 2020-10-30 | 中国地质大学(武汉) | Hyperspectral remote sensing image classification method based on attention mechanism and convolutional neural network |
- 2019-07-05: application CN201910605387.6A filed; granted as CN110490813B (status: Active)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||