CN112329778A - Semantic segmentation method for introducing feature cross attention mechanism - Google Patents

Semantic segmentation method for introducing feature cross attention mechanism

Info

Publication number
CN112329778A
CN112329778A (application CN202011144252.3A)
Authority
CN
China
Prior art keywords
features
attention
module
feature
spatial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011144252.3A
Other languages
Chinese (zh)
Inventor
彭思齐 (Peng Siqi)
曾海波 (Zeng Haibo)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University
Priority to CN202011144252.3A
Publication of CN112329778A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention addresses the problems that the Deeplabv3+ model segments image target edges inaccurately, fits image features slowly, and cannot effectively exploit attention information. A feature cross attention module is added to the model; the cross attention network consists of two branches and a feature cross attention module. The shallow branch extracts low-level spatial information, and the deep branch extracts high-level contextual features, so that important features are extracted more finely. The method designs and implements the connection between the feature cross attention mechanism and the encoding module of Deeplabv3+, feeding the output features of the Deeplabv3+ encoding module into the feature cross attention module for convolution operations that recalibrate the original features. The decoding module of Deeplabv3+ acquires the spatial features and channel features from the two branches respectively, and then fuses the acquired features to obtain the more important ones. The improved model is verified on the Pascal Voc2012 dataset, and the results show that the model with the added feature cross attention mechanism effectively remedies the defects of the original model, segments targets more finely, and better addresses problems such as rough segmentation boundaries.

Description

Semantic segmentation method for introducing feature cross attention mechanism
Technical Field
The invention belongs to the field of semantic segmentation, relates to a semantic segmentation model that introduces an attention mechanism, and particularly relates to a model that introduces a dual-attention-mechanism method into Deeplabv3+.
Background
At present, convolutional neural networks, with their rich representational capability, have greatly advanced visual tasks, and image semantic segmentation is one of the key tasks driving computer vision. As a classic computer vision problem (alongside image classification and object recognition and detection), image semantic segmentation has long been an important research direction; its essence is to classify the pixels in a picture. It is widely applied in related fields such as autonomous driving, land-cover classification, cloud detection, and medical detection. Among image semantic segmentation methods evaluated on the Pascal Voc2012 dataset, the currently popular models include the FCN, U-Net, and Deeplab families; they suffer from problems such as insufficient refinement of segmentation-result edges, rough boundaries in parts of the segmented image, and failure to fully exploit relationships between long-distance pixel classes.
Disclosure of Invention
In view of the above problems, the present invention proposes a semantic segmentation method based on a cross attention mechanism to solve, or at least partially mitigate, the above drawbacks of existing semantic segmentation methods.
The semantic segmentation method with the cross attention mechanism provided by the invention comprises the following steps:
the invention provides a model that introduces a cross attention mechanism into Deeplabv3+, wherein the cross attention model consists of a spatial attention module and a channel attention module;
the channel attention module extracts pixel information extracted from a high-layer convolution layer in the Deeplabv3+ model and is used for extracting deep spatial information;
learning the feature weight by using a space attention module through a network according to the loss;
the spatial attention module is used for endowing important feature map with large weight, and the model is trained in a mode of invalid or unimportant feature map with small weight to achieve better result;
the method for extracting the attention of the feature channel is basically similar to SEnet, a maxpool feature extraction method is added on the basis of SEnet, and the final output result is obtained by adding the average pooling result and the maximum pooling result;
the method for extracting the characteristics with the Avgpool is the same as the method for extracting the Avgpool in SENEt;
when the two pools are used, a shared MLP is used for attention inference to save parameters, and the two aggregated channel features are located in the same semantic embedding space;
each channel of the channel attention module features represents a special detector, and the channel attention is concerned about what features are meaningful;
in order to summarize spatial characteristics, the invention adopts two modes of global average pooling and maximum pooling to respectively utilize different information, and the operation process of the channel attention module is as follows.
$M_c(F) = \sigma\left(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\right)$
wherein MLP is a multilayer perceptron and σ is the sigmoid activation function; the input is a feature F of size H × W × C;
first, global average pooling and global max pooling over the spatial dimensions produce two 1 × 1 × C channel descriptions;
each description is fed into a two-layer neural network, in which the first layer has C/r neurons and a ReLU activation function;
the second layer has C neurons, and the two-layer network is shared between the two descriptions;
the two resulting features are added, and the weight coefficient Mc is obtained through a sigmoid activation function;
finally, the weight coefficient is multiplied by the original feature F to obtain the new, rescaled feature;
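For concreteness, the following is a minimal PyTorch sketch of this channel attention operation; the class name and the reduction ratio r = 16 are illustrative assumptions, not values taken from the patent:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention as described above: a shared two-layer MLP over
    globally average-pooled and max-pooled descriptors, summed and passed
    through a sigmoid to produce the per-channel weight coefficient Mc."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP (C -> C/r -> C), implemented with 1x1 convolutions so
        # the same weights serve both pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))  # 1x1xC avg-pool branch
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))   # 1x1xC max-pool branch
        mc = self.sigmoid(avg + mx)                               # weight coefficient Mc
        return x * mc                                             # rescale the original feature F
```

Sharing the MLP between the two pooled descriptors is what saves parameters and keeps the two aggregated channel features in the same embedding space, as described above.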
the spatial attention module extracts pixel content information extracted from a low-layer convolution layer in a Deeplabv3+ model and is used for extracting shallow spatial information;
spatial attention mechanisms are where meaningful features are of concern;
a spatial attention module, giving a H × W × C feature F;
firstly, respectively carrying out average pooling and maximum pooling of one channel dimension to obtain two HxWx1 channel descriptions, and splicing the two descriptions together according to the channel;
using the extracted features to get the right through a convolution layer with convolution kernel size of 7 multiplied by 7, the activation function is Sigmoid, and obtaining the weight coefficient Ms;
finally multiplying the weighting coefficient by the characteristic F' to obtain a new characteristic;
the operation formula of the space attention module is as follows;
$M_s(F) = \sigma\left(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\right)$
wherein σ is the sigmoid activation function, f^{7×7} is a convolutional layer with a 7 × 7 kernel, and [·;·] denotes concatenation of feature maps along the channel dimension.
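A corresponding PyTorch sketch of the spatial attention operation (again, names are illustrative assumptions; the 7 × 7 kernel follows the description above):

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Spatial attention as described above: channel-wise average and max
    pooling concatenated into an HxWx2 map, a 7x7 convolution, and a
    sigmoid producing the spatial weight map Ms."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = torch.mean(x, dim=1, keepdim=True)    # HxWx1 average over channels
        mx, _ = torch.max(x, dim=1, keepdim=True)   # HxWx1 max over channels
        ms = self.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # weight map Ms
        return x * ms                                # reweight the input feature
```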
The feature cross attention module of the invention extracts shallow spatial information with the spatial attention module and then captures contextual information using the channel attention mechanism;
the output features of the two branches differ: the high-level features mainly contain category information, so the channel attention module can be used to extract features from the high-level information, while the low level corresponds to more spatial information;
the feature cross attention module cannot directly sample and fuse the low-level features, so the spatial attention module is used to perform feature extraction on them;
in the added FCA module, the high-level features of the channel attention module provide context information, while the low-level features extracted by the spatial attention module refine the pixel localization;
first, the output features of the two branches are cascaded, and the cascaded features undergo convolution, batch normalization, and ReLU processing;
then the fused features and the output of the spatial branch processed by the SA module are used as input to help refine localization;
after the SA module's features undergo convolution, normalization, and a sigmoid nonlinearity, they are multiplied by the fused features;
the output of the spatial attention block, together with the context features of the context branch, is applied to a channel attention block, and the context features are compressed along the spatial dimensions by global average pooling and max pooling to obtain two vectors;
the two vectors are passed through a shared fully connected layer and a sigmoid operator to generate an attention map, followed finally by convolution, batch normalization, and ReLU fusion;
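The following is a hedged PyTorch sketch of how the described FCA fusion might be assembled; it assumes the high-level (context) feature has already been upsampled to the low-level feature's spatial resolution, and all module and channel names are illustrative assumptions rather than the patent's exact implementation:

```python
import torch
import torch.nn as nn

class FeatureCrossAttention(nn.Module):
    """Sketch of the FCA fusion described above: concatenate the two branch
    outputs and fuse them (conv + BN + ReLU), gate the fused feature with the
    spatial branch (conv + BN + sigmoid), re-weight the gated feature with a
    channel attention map (shared FC over avg/max-pooled vectors), add back
    the fused feature, and apply a final conv + BN + ReLU."""
    def __init__(self, low_ch: int, high_ch: int, out_ch: int, reduction: int = 16):
        super().__init__()
        self.fuse = nn.Sequential(                 # cascade + conv/BN/ReLU
            nn.Conv2d(low_ch + high_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.sa_gate = nn.Sequential(              # spatial branch: conv + BN + sigmoid
            nn.Conv2d(low_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.Sigmoid())
        self.ca_fc = nn.Sequential(                # shared FC layer for channel attention
            nn.Linear(out_ch, out_ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(out_ch // reduction, out_ch))
        self.out = nn.Sequential(                  # final conv + BN + ReLU fusion
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(torch.cat([low, high], dim=1))  # concatenate branch outputs
        spatial = self.sa_gate(low) * fused               # spatial gating of fused feature
        b, c, _, _ = spatial.shape
        avg = self.ca_fc(spatial.mean(dim=(2, 3)))        # globally average-pooled vector
        mx = self.ca_fc(spatial.amax(dim=(2, 3)))         # globally max-pooled vector
        attn = torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel attention map
        return self.out(spatial * attn + fused)           # re-weight, add, final fusion
```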
the semantic segmentation model of the feature cross attention module has better effects of extracting the target edge information and the content information:
1. the introduction of a characteristic cross attention mechanism into the DeeplabV3+ model is proposed, and the Deeplabv3+ model based on the cross attention mechanism is proposed.
2. The significance degree of pixel features is distinguished by emphasizing meaningful feature information on channel and space dimensions and carrying out convolution operation to redistribute weights, the more important the pixel features are, the more important the obtained weights are, and then the segmentation of the image is obtained through the joint learning of a main branch and a cross attention module.
3. The attention mechanism is a simple and effective lightweight module, and adding this module adds little additional computation.
4. After Deeplabv3+ is introduced, due to the fact that important information is selectively concerned by an attention mechanism, network areas of the improved network are divided more accurately, ideal target areas can be divided, the edges of objects can be accurately divided, and the problem that semantic division and marking are unreasonable is effectively solved.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a schematic model diagram of Deeplabv3+
FIG. 2 is a block diagram of a channel attention mechanism
FIG. 3 is a block diagram of a space-based mechanism
FIG. 4 is a block diagram of a feature cross attention mechanism
Fig. 5 is a schematic diagram of the Deeplabv3+ model with the cross attention mechanism
Figs. 6, 7 and 8 are graphs of the test results of the Deeplabv3+ model with the introduced feature cross attention mechanism in the practice of the present invention.
Detailed Description
In order to make the purpose and technical solution of the present invention more clearly understood, the following detailed description is made with reference to the accompanying drawings and examples, and the application principle of the present invention is described in detail.
The embodiment of the invention provides a Deeplabv3+ model that introduces a feature cross attention mechanism. Fig. 5 shows a schematic diagram of the modified Deeplabv3+ model, and the specific operation flow is shown in Fig. 5.
The Deeplabv3+ model is built on the Xception network; the final fully connected layer is removed first to achieve end-to-end output.
The last two pooling layers of the Xception network are removed: convolution itself has translational invariance, and pooling layers further enhance this property of the network, since pooling is inherently a process of blurring location. Semantic segmentation is an end-to-end problem in which each pixel must be classified accurately and pixel position matters; using too much pooling makes the feature layer too small and the included features too sparse, which hurts semantic segmentation, so part of the pooling must be removed.
This increases the density of the features and enlarges the receptive field, and the classification precision is improved by using a conditional random field (CRF).
An ASPP (Atrous Spatial Pyramid Pooling) structure is adopted; it applies hole (atrous) convolution operations with different sampling rates to the input feature map in parallel, i.e., it captures image context information at multiple scales.
ASPP is improved by using a 1 × 1 convolution, that is, when the rate becomes large, a degenerate version of the 3 × 3 convolution is used in its place to reduce the number of parameters; another improvement is to add image-level output, which may be called global pooling, to supplement global features.
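A sketch of a typical ASPP block consistent with this description; the sampling rates (6, 12, 18) and the 256 output channels are common Deeplabv3+ defaults assumed here, not values stated in the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel atrous convolutions at
    several sampling rates, a 1x1 convolution, and an image-level (global
    pooling) branch, concatenated and projected back to out_ch channels."""
    def __init__(self, in_ch: int, out_ch: int = 256, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([nn.Sequential(   # 1x1 convolution branch
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))])
        for r in rates:  # 3x3 atrous convolutions with different sampling rates
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)))
        self.image_pool = nn.Sequential(  # global pooling branch for global features
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False), nn.ReLU(inplace=True))
        self.project = nn.Sequential(     # fuse all branches back to out_ch
            nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        feats = [branch(x) for branch in self.branches]
        feats.append(F.interpolate(self.image_pool(x), size=(h, w),
                                   mode='bilinear', align_corners=False))
        return self.project(torch.cat(feats, dim=1))
```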
All convolutional and pooling layers are replaced with depthwise separable convolutions, with BN and ReLU applied after each 3 × 3 depthwise separable convolution.
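A minimal sketch of one such 3 × 3 depthwise separable convolution block with BN and ReLU:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A 3x3 depthwise separable convolution with BN and ReLU, as used in
    place of standard convolutions in the Xception backbone."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1, dilation: int = 1):
        super().__init__()
        # Depthwise step: one 3x3 filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=dilation, dilation=dilation,
                                   groups=in_ch, bias=False)
        # Pointwise step: 1x1 convolution that mixes channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))
```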
The feature-channel attention extraction method is basically similar to that of SENet, with a maxpool feature extraction step added on top of SENet; the final output is obtained by adding the average-pooling and max-pooling results. The feature extraction with Avgpool is the same as the Avgpool extraction in SENet. Furthermore, when the two pooling operations are used, a shared MLP performs the attention inference to save parameters, and the two aggregated channel features both lie in the same semantic embedding space. Each channel of the channel attention module's features acts as a dedicated detector, and channel attention is concerned with which features are meaningful. In order to summarize the spatial features, both global average pooling and max pooling are adopted so as to exploit different information; the operation of the channel attention module is shown in the following formula.
$M_c(F) = \sigma\left(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\right)$
The spatial attention mechanism of the invention is concerned with where the meaningful features are. Given an H × W × C feature F, average pooling and max pooling along the channel dimension are first performed to obtain two H × W × 1 descriptions, which are concatenated along the channel. The concatenated feature is then passed through a convolutional layer with a 7 × 7 kernel and a sigmoid activation function, yielding the weight coefficient Ms. Finally, multiplying the weight coefficient by the feature F' gives the new feature. The operation is shown in the following formula.
$M_s(F) = \sigma\left(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\right)$
The Deeplabv3+ model with the added feature cross attention mechanism is implemented mainly as an encoder and a decoder: the encoder comprises the depthwise separable convolution layers and an ASPP layer, while the decoder fuses the low-level features and restores the feature map. Thanks to the separable convolutions, the proposed model is faster and stronger, and its computational complexity is significantly reduced.
The feature cross attention module extracts shallow spatial information using the spatial attention module (Fig. 3) and captures context information using the channel attention mechanism (Fig. 2).
The output features of the two branches differ: because the high-level features mainly contain category information, the channel attention module can be used to extract features from the high-level information, while the low level corresponds to more spatial information that cannot be directly sampled and fused, so the spatial attention module is used to extract features from the low level.
In the added feature cross attention mechanism, the high-level features of the channel attention module provide context information, while the low-level features extracted by the spatial attention module refine the pixel localization. First, the output features of the two branches are cascaded, and the cascaded features undergo convolution, batch normalization, and ReLU processing,
and then the fused features and the output of the spatial branch processed by the SA module are used as input to help refine localization. The SA module's features, after convolution, normalization, and a sigmoid nonlinearity, are multiplied by the fused features. The output of the spatial attention block, together with the context features of the context branch, is applied to a channel attention block, and the context features are compressed along the spatial dimensions by global average pooling and max pooling to obtain two vectors. The two vectors are then passed through a shared fully connected layer and a sigmoid operator to generate an attention map, followed finally by convolution, batch normalization, and ReLU fusion.
To better mine the spatial features and channel features in the decoder, after the hole convolution is performed, a 1 × 1 convolution extracts the shallow features, and an SA attention module is then added to obtain better shallow spatial information.
A channel attention mechanism is added after the feature obtained by four-times upsampling of the feature information produced by the ASPP operation, to obtain higher-level context channel information. The added modules have little influence on the structure of the original network model and add almost no extra training parameters or overhead, while letting the model obtain more important spatial features and channel features.
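A sketch of this decoder-side placement, reusing the ChannelAttention and SpatialAttention modules sketched earlier; the channel sizes follow common Deeplabv3+ conventions and are assumptions, not values from the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
# Assumes the ChannelAttention and SpatialAttention classes sketched
# earlier in this document are in scope.

class AttentionDecoder(nn.Module):
    """Decoder-side attention placement described above: a 1x1 convolution
    plus spatial attention on the shallow (low-level) feature, and channel
    attention on the 4x-upsampled ASPP output."""
    def __init__(self, low_ch: int, aspp_ch: int = 256, low_out: int = 48):
        super().__init__()
        self.low_proj = nn.Sequential(        # 1x1 conv on the shallow feature
            nn.Conv2d(low_ch, low_out, 1, bias=False),
            nn.BatchNorm2d(low_out), nn.ReLU(inplace=True))
        self.sa = SpatialAttention()          # refine shallow spatial information
        self.ca = ChannelAttention(aspp_ch)   # higher-level context channel info

    def forward(self, low: torch.Tensor, aspp_out: torch.Tensor) -> torch.Tensor:
        low = self.sa(self.low_proj(low))     # SA after the 1x1 convolution
        up = F.interpolate(aspp_out, size=low.shape[2:],
                           mode='bilinear', align_corners=False)  # 4x upsampling
        up = self.ca(up)                      # CA after the upsampling
        return torch.cat([low, up], dim=1)    # fused decoder feature
```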
In the FCA module, the output feature maps of the two branches are first concatenated, and then a 3 × 3 convolution, batch normalization, and a ReLU unit are applied to the concatenated feature map.
The spatial branch features undergo a 3 × 3 convolution, batch normalization, and a sigmoid nonlinearity, and are then multiplied by the fused features; the output of the spatial attention block and the context features of the context branch are applied to the channel attention block.
The context features are compressed along the spatial dimensions by global average pooling and max pooling to obtain two vectors. These two vectors are then applied to the shared fully connected layer and sigmoid operator to generate the attention map. The attention map is next multiplied by the output features from the spatial attention block and added to the fused features.
The application effect of the invention is described in detail below in combination with a Matlab/Simulink simulation diagram:
The visualization results are shown in Figs. 6, 7 and 8. It can be seen that our model outperforms the original model in overall, edge, and detail aspects; in terms of edge detail, the network with the added cross attention mechanism learns and exploits the information in the target region well and aggregates features from the target region, and our model's feature refinement process ultimately guides the network to make reasonable use of the given features. The proposed Deeplabv3+ model with the introduced cross attention module refines the attention mechanism into two different modules, achieves a clear performance improvement while keeping the computational cost small, designs a dual-branch network to improve the context features, and simultaneously encodes low-level spatial information.

Claims (6)

1. A Deeplabv3+ model incorporating a feature cross attention module (FCA), comprising:
a feature cross attention module (FCA) that extracts shallow spatial information with a spatial attention module
and captures context information using a channel attention mechanism;
wherein the output features of the two branches differ, the high-level features mainly comprising category information;
the channel attention module is used to extract features from the high-level information, while the low level corresponds to more spatial information that cannot be directly sampled and fused, and the spatial attention module is used to extract features from the low-level information.
2. The Deeplabv3+ model according to claim 1, wherein the added FCA module comprises:
high-level features of its channel attention module that are used to provide context information, while the low-level features extracted by the spatial attention module are used to refine the pixel localization;
the output features of the two branches are cascaded, and the cascaded features undergo convolution, batch normalization, and ReLU processing;
and the fused features and the output of the spatial branch processed by the SA module are used as input to help refine localization.
3. The model of claim 1 or 2, further comprising:
after the SA module's features undergo normalization and a sigmoid nonlinear convolution, they are multiplied by the fused features;
the output of the spatial attention block, together with the context features of the context branch, is applied to a channel attention block, and the context features are compressed along the spatial dimensions by global pooling and max pooling to obtain two vectors;
and the two vectors are passed through a shared fully connected layer and a sigmoid operator to generate an attention map, followed finally by convolution, batch normalization, and ReLU fusion.
4. The model of claim 3, further comprising:
to better mine the spatial features and channel features in the decoder, after the original image undergoes hole convolution, the shallow features are extracted using a 1 × 1 convolution, and an SA attention module is then added to obtain better shallow spatial information;
and a channel attention mechanism is added after the feature obtained by four-times upsampling of the feature information produced by the ASPP operation on the original image, to obtain higher-level context channel information.
5. The model of claim 4, further comprising:
the added module has little influence on the structure of the original network model and adds almost no extra training parameters or overhead, while the model obtains more important spatial features and channel features;
in the FCA module, the output features of the two branches are first concatenated, and then 3 × 3 convolution, batch normalization, and ReLU units are applied to the concatenated features;
and after the spatial branch feature undergoes a 3 × 3 convolution, batch normalization, and a sigmoid nonlinearity, it is multiplied by the fusion feature.
6. The model of claim 1, comprising:
applying the output of the spatial attention block and the context features of the context branch to the channel attention block;
compressing the context features along the spatial dimensions with global pooling and max pooling to obtain two vectors;
and applying these two vectors to the shared fully connected layer and sigmoid operator to generate the attention map, which is next multiplied by the output features from the spatial attention block and added to the fused features.
CN202011144252.3A 2020-10-23 2020-10-23 Semantic segmentation method for introducing feature cross attention mechanism Pending CN112329778A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011144252.3A CN112329778A (en) 2020-10-23 2020-10-23 Semantic segmentation method for introducing feature cross attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011144252.3A CN112329778A (en) 2020-10-23 2020-10-23 Semantic segmentation method for introducing feature cross attention mechanism

Publications (1)

Publication Number Publication Date
CN112329778A true CN112329778A (en) 2021-02-05

Family

ID=74311590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011144252.3A Pending CN112329778A (en) 2020-10-23 2020-10-23 Semantic segmentation method for introducing feature cross attention mechanism

Country Status (1)

Country Link
CN (1) CN112329778A (en)


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516022A (en) * 2021-04-23 2021-10-19 黑龙江机智通智能科技有限公司 Fine-grained classification system for cervical cells
CN113435253A (en) * 2021-05-31 2021-09-24 西安电子科技大学 Multi-source image combined urban area ground surface coverage classification method
CN113435253B (en) * 2021-05-31 2022-12-02 西安电子科技大学 Multi-source image combined urban area ground surface coverage classification method
CN114119698A (en) * 2021-06-18 2022-03-01 湖南大学 Unsupervised monocular depth estimation method based on attention mechanism
CN113435578A (en) * 2021-06-25 2021-09-24 重庆邮电大学 Feature map coding method and device based on mutual attention and electronic equipment
CN113435578B (en) * 2021-06-25 2022-04-05 重庆邮电大学 Feature map coding method and device based on mutual attention and electronic equipment
CN113989234A (en) * 2021-10-28 2022-01-28 杭州中科睿鉴科技有限公司 Image tampering detection method based on multi-feature fusion
CN114972130A (en) * 2022-08-02 2022-08-30 深圳精智达技术股份有限公司 Training method, device and training equipment for denoising neural network
CN116503428A (en) * 2023-06-27 2023-07-28 吉林大学 Image feature extraction method and segmentation method based on refined global attention mechanism
CN116503428B (en) * 2023-06-27 2023-09-08 吉林大学 Image feature extraction method and segmentation method based on refined global attention mechanism

Similar Documents

Publication Publication Date Title
CN112329778A (en) Semantic segmentation method for introducing feature cross attention mechanism
CN112541503B (en) Real-time semantic segmentation method based on context attention mechanism and information fusion
CN109190752B (en) Image semantic segmentation method based on global features and local features of deep learning
CN109543502B (en) Semantic segmentation method based on deep multi-scale neural network
CN108171701B (en) Significance detection method based on U network and counterstudy
CN111563909B (en) Semantic segmentation method for complex street view image
CN113408321B (en) Real-time target detection method and device for lightweight image and video data
CN114943876A (en) Cloud and cloud shadow detection method and device for multi-level semantic fusion and storage medium
CN113011336B (en) Real-time street view image semantic segmentation method based on deep multi-branch aggregation
CN114463340B (en) Agile remote sensing image semantic segmentation method guided by edge information
Zeng et al. Deeplabv3+ semantic segmentation model based on feature cross attention mechanism
CN116740516A (en) Target detection method and system based on multi-scale fusion feature extraction
CN114120148B (en) Method for detecting changing area of remote sensing image building
Liu et al. Road segmentation with image-LiDAR data fusion in deep neural network
CN114299305B (en) Saliency target detection algorithm for aggregating dense and attention multi-scale features
CN116051977A (en) Multi-branch fusion-based lightweight foggy weather street view semantic segmentation algorithm
CN117541505A (en) Defogging method based on cross-layer attention feature interaction and multi-scale channel attention
CN115995002B (en) Network construction method and urban scene real-time semantic segmentation method
CN112418229A (en) Unmanned ship marine scene image real-time segmentation method based on deep learning
CN116363361A (en) Automatic driving method based on real-time semantic segmentation network
CN116246109A (en) Multi-scale hole neighborhood attention computing backbone network model and application thereof
CN113627368B (en) Video behavior recognition method based on deep learning
CN113284042B (en) Multi-path parallel image content characteristic optimization style migration method and system
Yi et al. Gated residual feature attention network for real-time Dehazing
Yanqin et al. Crowd density estimation based on conditional random field and convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210205