CN116129207A - Image data processing method for attention of multi-scale channel - Google Patents

Image data processing method for attention of multi-scale channel

Info

Publication number
CN116129207A
Authority
CN
China
Prior art keywords
global
input data
data
channels
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310414590.1A
Other languages
Chinese (zh)
Other versions
CN116129207B (en)
Inventor
刘刚
王冰冰
周杰
王磊
史魁杰
曾辉
张金烁
胡莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Normal University
Original Assignee
Jiangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Normal University filed Critical Jiangxi Normal University
Priority to CN202310414590.1A priority Critical patent/CN116129207B/en
Publication of CN116129207A publication Critical patent/CN116129207A/en
Application granted granted Critical
Publication of CN116129207B publication Critical patent/CN116129207B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Facsimile Image Signal Circuits (AREA)

Abstract

The invention discloses an image data processing method of multi-scale channel attention. Global features and local features are extracted from the input data, so that a convolutional neural network attends to both the overall information and the local detail features of the input, alleviating the problems of target aggregation and target occlusion in complex scenes.

Description

Image data processing method for attention of multi-scale channel
Technical Field
The invention relates to the field of computer vision, in particular to an image data processing method of multi-scale channel attention.
Background
The channel attention mechanism can significantly improve the expressiveness and generalization ability of a model, has low computational cost, and is easy to integrate into existing convolutional neural network structures. Owing to these advantages, channel attention has been widely used in deep learning applications such as image classification, object detection and semantic segmentation.
The essence of the channel attention mechanism is to compute a weighted average over the features of different channels, thereby obtaining a richer, more stable and more reliable feature expression.
Existing channel attention mechanisms, including SE, ECA and CA, focus only on the detail information of some local feature or on the semantic information of the global feature, but not on both, resulting in insufficient feature expression along the channel dimension.
Disclosure of Invention
The invention aims to provide an image data processing method of multi-scale channel attention.
The invention aims to solve the following problem:
an image data processing method of multi-scale channel attention is provided, in which global features and local features are extracted from the input data, so that the convolutional neural network attends to both the overall information and the local detail features of the input, alleviating the problems of target aggregation and target occlusion in complex scenes.
The image data processing method of multi-scale channel attention adopts the following technical scheme, comprising the steps of:
S21: the input data (an original image or a feature map) is digitized: the extracted features are converted into numerical values and stored as tensor matrices, and normalization is applied to accelerate the convergence of the convolutional neural network;
S22: feature extraction and feature fusion are performed on the input data by combining a global channel attention mechanism and a local channel attention mechanism;
S23: the global channel attention mechanism uses global average pooling, a one-dimensional convolution layer with an adaptively selected kernel size, and a Sigmoid activation function. Through global average pooling and element-wise transformation of the feature map, global channel attention adaptively adjusts the weights of different channels, so that the model attends to more important features, improving its classification performance and robustness. The calculation formula of global average pooling is as follows:
g(X) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} X_{ij}

wherein g(X) represents the global average pooling result, X is the input image of dimensions W×H×C, W, H and C represent the width, height and channels of the input image respectively, and i and j represent the pixel positions on the width and height respectively;
the calculation formula of the self-adaptive selection is as follows:
k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{odd}

wherein k represents the convolution kernel size of the one-dimensional convolution, C indicates the number of channels, |\cdot|_{odd} means that k can only be odd, and \gamma and b are used for changing the ratio between C and k; in the present invention \gamma and b take 2 and 1 respectively;
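With γ and b taken as 2 and 1, the adaptive selection of the kernel size can be sketched in Python. The reading of |·|_odd used here (truncate to an integer, then round up to the nearest odd number) follows the common ECA-Net convention and is an assumption, not something the text spells out:

```python
import math

def adaptive_kernel_size(C: int, gamma: int = 2, b: int = 1) -> int:
    """Kernel size k = |log2(C)/gamma + b/gamma|_odd for the 1-D convolution.
    |.|_odd is read as: truncate to an integer, then force the result odd,
    since the formula restricts k to odd values."""
    t = int(abs(math.log2(C) / gamma + b / gamma))
    return t if t % 2 == 1 else t + 1
```

So a 256-channel input would use a kernel of size 5, while 64 channels would use size 3: the kernel grows slowly (logarithmically) with the channel count.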
the Sigmoid activation function is also called an S-shaped growth curve, and the calculation formula is:
\sigma(x) = \frac{1}{1 + e^{-x}}

wherein x is the input;
S24: the local channel attention mechanism adopts a multi-layer perceptron (MLP) implemented with two-dimensional convolutions to extract local features. The MLP consists of two two-dimensional convolutions with kernel size 1 and an intermediate ReLU activation; passing the input data through these convolutions changes only the number of channels. The first convolution outputs one sixteenth of the number of input channels, and the second restores the channel count to match that of the embedding position. Local channel attention helps the model better capture the local information in the input features;
S25: the ReLU function retains only the positive elements and discards all negative elements by setting their activation values to 0;
S26: the output of the global channel attention and the output of the local channel attention are fused; the Sigmoid function activates the fused data to obtain the final attention weights, and the activated data is then multiplied pixel by pixel with the input data;
S27: the Sigmoid function compresses data according to its range, mapping any input to a value in the interval (0, 1), which ensures normalization;
S28: multiplying the input data pixel by pixel with the activated data applies different weights at different positions of the input data, so that global features and local features receive more attention.
Further, passing the input data through the two-dimensional convolutions of step S24 changes only the number of channels. The whole MLP architecture estimates the attention among channels by first shrinking and then expanding the channels of the input data: with shrinkage coefficient r, the feature scale after shrinking is H×W×C/r and, after a ReLU activation and expansion, the feature scale returns to H×W×C.
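The shrink-then-expand MLP just described can be sketched in NumPy. A 1×1 two-dimensional convolution acts per pixel as a linear map over channels, so it reduces to a matrix product on the flattened feature map; the function name and the random weights below are illustrative stand-ins for the learned parameters:

```python
import numpy as np

def local_channel_attention(x, r=16, seed=0):
    """Shrink-then-expand channel MLP of step S24, sketched in NumPy.
    The weights are random placeholders for the learned parameters."""
    rng = np.random.default_rng(seed)
    C, H, W = x.shape
    Cr = max(C // r, 1)                       # shrink: C -> C/r channels
    W1 = rng.standard_normal((Cr, C)) * 0.1   # first 1x1 conv as a matrix
    W2 = rng.standard_normal((C, Cr)) * 0.1   # second 1x1 conv: expand back to C
    flat = x.reshape(C, H * W)                # each column is one pixel's channels
    hidden = np.maximum(W1 @ flat, 0.0)       # ReLU keeps positives, zeroes negatives
    return (W2 @ hidden).reshape(C, H, W)     # back to the H x W x C scale
```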
Further, steps S23 and S24 extract the global features and the local features of the input data, respectively, via the global average pooling of the global channel attention mechanism and the multi-layer perceptron MLP of the local channel attention mechanism. Step S26 then fuses the outputs of the two mechanisms, i.e. performs feature fusion on the different features, so that the convolutional neural network focuses on both the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
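The fusion and gating of steps S26 to S28 amount to a few lines. This NumPy sketch assumes the global branch yields a (C, 1, 1) tensor and the local branch a (C, H, W) tensor, which then broadcast together; both shapes are assumptions consistent with the description rather than values stated in the text:

```python
import numpy as np

def fuse_and_gate(x, global_att, local_att):
    """Steps S26-S28, sketched: sum the two branch outputs, squash the sum
    into (0, 1) with Sigmoid, then reweight the input pixel by pixel."""
    weights = 1.0 / (1.0 + np.exp(-(global_att + local_att)))  # Sigmoid in (0, 1)
    return x * weights  # pixel-wise product; (C,1,1) broadcasts over (C,H,W)
```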
The invention has the beneficial effects that the multi-scale channel attention image data processing method further alleviates the low detection precision, high miss rate and similar problems caused by the heavy aggregation and severe occlusion typical of small-target detection in complex scenes. By extracting the global and local features of the data and fusing these different features, the convolutional neural network attends to both the overall information and the local detail features of the input, alleviating target aggregation and target occlusion in complex scenes.
Drawings
FIG. 1 is a schematic diagram of the image data processing method of multi-scale channel attention according to the invention;
FIG. 2 is a diagram of the ReLU rectified linear function in the invention;
FIG. 3 is a schematic diagram of Sigmoid-function data normalization in the invention.
Detailed Description
The invention will be further clarified and fully described below in connection with the accompanying drawings; the scope of protection of the invention is not limited thereto.
Examples
As shown in fig. 1 to 3, an image data processing method of multi-scale channel attention includes the following steps:
S21: the input data (an original image or a feature map) is digitized: the extracted features are converted into numerical values and stored as tensor matrices, and normalization is applied to accelerate the convergence of the convolutional neural network;
S22: feature extraction and feature fusion are performed on the input data by combining a global channel attention mechanism and a local channel attention mechanism, as shown in fig. 1;
S23: the global channel attention mechanism uses global average pooling, a one-dimensional convolution layer with an adaptively selected kernel size, and a Sigmoid activation function, as shown in the left column of fig. 1. Through global average pooling and element-wise transformation of the feature map, global channel attention adaptively adjusts the weights of different channels, so that the model attends to more important features, improving its classification performance and robustness. The calculation formula of global average pooling is as follows:
g(X) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} X_{ij}

wherein g(X) represents the global average pooling result, X is the input image of dimensions W×H×C, W, H and C represent the width, height and channels of the input image respectively, and i and j represent the pixel positions on the width and height respectively;
the calculation formula of the self-adaptive selection is as follows:
k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{odd}

wherein k represents the convolution kernel size of the one-dimensional convolution, C indicates the number of channels, |\cdot|_{odd} means that k can only be odd, and \gamma and b are used for changing the ratio between C and k; in the present invention \gamma and b take 2 and 1 respectively;
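The global branch of step S23 (global average pooling, a one-dimensional convolution of size k across the channel descriptor, then a Sigmoid) can be sketched in NumPy. The uniform kernel and the edge padding are assumptions made for illustration; in practice the 1-D kernel is learned:

```python
import numpy as np

def global_channel_attention(x, k=3):
    """Global branch of step S23, sketched: global average pooling yields one
    descriptor per channel; a 1-D convolution of (adaptively chosen) size k
    mixes each channel with its k - 1 neighbours; Sigmoid maps the result
    into (0, 1). The uniform kernel is a placeholder for learned weights."""
    C = x.shape[0]
    g = x.mean(axis=(1, 2))                  # global average pooling -> shape (C,)
    padded = np.pad(g, k // 2, mode='edge')  # pad so the output length stays C
    kernel = np.full(k, 1.0 / k)             # placeholder 1-D convolution kernel
    mixed = np.array([padded[i:i + k] @ kernel for i in range(C)])
    weights = 1.0 / (1.0 + np.exp(-mixed))   # Sigmoid -> per-channel weights
    return weights[:, None, None]            # shape (C, 1, 1) for broadcasting
```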
the Sigmoid activation function is also called an S-shaped growth curve, and as shown in fig. 3, the calculation formula is:
\sigma(x) = \frac{1}{1 + e^{-x}}

wherein x is the input;
S24: the local channel attention mechanism adopts a multi-layer perceptron (MLP) implemented with two-dimensional convolutions to extract local features. The MLP consists of two two-dimensional convolutions with kernel size 1 and an intermediate ReLU activation; the ReLU activation sets the output of some neurons to 0, which reduces the interdependence of parameters and alleviates overfitting. Passing the input data through these convolutions changes only the number of channels: the first convolution outputs one sixteenth of the number of input channels, and the second restores the channel count to match that of the embedding position. Local channel attention helps the model better capture the local information in the input features, as shown in the right column of fig. 1;
S25: the ReLU function retains only the positive elements and discards all negative elements by setting their activation values to 0, as shown in fig. 2;
S26: the output of the global channel attention and the output of the local channel attention are fused; the Sigmoid function activates the fused data to obtain the final attention weights, and the activated data is then multiplied pixel by pixel with the input data;
S27: the Sigmoid function compresses the data according to its range, mapping any input to a value in the interval (0, 1) to ensure normalization, as shown in fig. 1;
S28: multiplying the input data pixel by pixel with the activated data applies different weights at different positions of the input data, so that global features and local features receive more attention, as shown in fig. 1.
Passing the input data through the two-dimensional convolutions of step S24 changes only the number of channels. The whole MLP architecture estimates the attention among channels by first shrinking and then expanding the channels of the input data: with shrinkage coefficient r, the feature scale after shrinking is H×W×C/r and, after a ReLU activation and expansion, the feature scale returns to H×W×C.
Steps S23 and S24 extract the global features and the local features of the input data, respectively, via the global average pooling of the global channel attention mechanism and the multi-layer perceptron MLP of the local channel attention mechanism. Step S26 then fuses the outputs of the two mechanisms, i.e. performs feature fusion on the different features, so that the convolutional neural network focuses on both the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
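Putting the pieces together, a minimal NumPy sketch of the whole block described above (global branch with adaptive kernel size, local shrink-then-expand MLP with r = 16, Sigmoid gating and pixel-wise reweighting) might look as follows. All weights are placeholders, not the trained parameters of the invention:

```python
import math
import numpy as np

def multi_scale_channel_attention(x, r=16, seed=0):
    """End-to-end sketch of the described block: a global branch (global
    average pooling followed by a 1-D convolution of adaptive size k) and a
    local branch (1x1-convolution MLP with shrink factor r and ReLU) are
    summed, Sigmoid-activated, and used to reweight the input pixel by
    pixel. All weights below are placeholders, not learned values."""
    rng = np.random.default_rng(seed)
    C, H, W = x.shape

    # Global branch: one weight per channel, shape (C, 1, 1).
    t = int(abs(math.log2(C) / 2 + 1 / 2))   # gamma = 2, b = 1
    k = t if t % 2 == 1 else t + 1           # k is restricted to odd values
    g = x.mean(axis=(1, 2))                  # global average pooling
    padded = np.pad(g, k // 2, mode='edge')
    kernel = np.full(k, 1.0 / k)             # placeholder 1-D kernel
    g_att = np.array([padded[i:i + k] @ kernel for i in range(C)])[:, None, None]

    # Local branch: per-pixel channel MLP, shape (C, H, W).
    Cr = max(C // r, 1)
    W1 = rng.standard_normal((Cr, C)) * 0.1  # shrink 1x1 conv
    W2 = rng.standard_normal((C, Cr)) * 0.1  # expand 1x1 conv
    l_att = (W2 @ np.maximum(W1 @ x.reshape(C, H * W), 0.0)).reshape(C, H, W)

    # Fuse, activate, and reweight (steps S26-S28).
    weights = 1.0 / (1.0 + np.exp(-(g_att + l_att)))
    return x * weights
```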
The embodiments of the invention are disclosed as preferred embodiments but are not limited thereto; those skilled in the art will readily appreciate from the foregoing description that various extensions and modifications can be made without departing from the spirit of the invention.

Claims (3)

1. A method of processing image data with multi-scale channel attention, comprising the steps of:
S21: digitizing the input data, namely an original image or a feature map: the extracted features are converted into numerical values and stored as tensor matrices, and normalization is applied to accelerate the convergence of the convolutional neural network;
S22: performing feature extraction and feature fusion on the input data by combining a global channel attention mechanism and a local channel attention mechanism;
S23: using, in the global channel attention mechanism, global average pooling, a one-dimensional convolution layer with an adaptively selected kernel size, and a Sigmoid activation function, wherein the calculation formula of the global average pooling is as follows:
g(X) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} X_{ij}

wherein g(X) represents the global average pooling result, X is the input image of dimensions W×H×C, W, H and C represent the width, height and channels of the input image respectively, and i and j represent the pixel positions on the width and height respectively;
the calculation formula of the self-adaptive selection is as follows:
k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{odd}

wherein k represents the convolution kernel size of the one-dimensional convolution, C indicates the number of channels, |\cdot|_{odd} means that k can only be odd, and \gamma and b are used for changing the ratio between C and k;
the Sigmoid activation function is also called an S-shaped growth curve, and the calculation formula is:
\sigma(x) = \frac{1}{1 + e^{-x}}

wherein x is the input;
S24: adopting, in the local channel attention mechanism, a multi-layer perceptron MLP implemented with two-dimensional convolutions to extract local features, the MLP consisting of two two-dimensional convolutions with kernel size 1 and an intermediate ReLU activation, wherein passing the input data through these convolutions changes only the number of channels, the first convolution outputting one sixteenth of the number of input channels and the second restoring the channel count to match that of the embedding position;
S25: the ReLU function retaining only the positive elements and discarding all negative elements by setting their activation values to 0;
S26: fusing the output of the global channel attention and the output of the local channel attention, activating the fused data with the Sigmoid function to obtain the final attention weights, and multiplying the activated data pixel by pixel with the input data;
S27: compressing the data with the Sigmoid function according to its range, mapping any input to a value in the interval (0, 1) to ensure normalization;
S28: multiplying the input data pixel by pixel with the activated data to apply different weights at different positions of the input data, so that global features and local features receive more attention.
2. The method of processing image data with multi-scale channel attention according to claim 1, wherein
steps S23 and S24 extract the global features and the local features of the input data, respectively, via the global average pooling of the global channel attention mechanism and the multi-layer perceptron MLP of the local channel attention mechanism, and step S26 fuses the outputs of the two mechanisms, i.e. performs feature fusion on the different features, so that the convolutional neural network focuses on both the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
3. The method according to claim 1, wherein passing the input data through the two-dimensional convolutions of step S24 changes only the number of channels, and the whole MLP architecture estimates the attention among channels by first shrinking and then expanding the channels of the input data, wherein the shrinkage coefficient is r, the feature scale after shrinking is H×W×C/r and, after a ReLU activation and expansion, the feature scale returns to H×W×C.
CN202310414590.1A 2023-04-18 2023-04-18 Image data processing method for attention of multi-scale channel Active CN116129207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310414590.1A CN116129207B (en) 2023-04-18 2023-04-18 Image data processing method for attention of multi-scale channel


Publications (2)

Publication Number Publication Date
CN116129207A 2023-05-16
CN116129207B 2023-08-04

Family

ID=86301329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310414590.1A Active CN116129207B (en) 2023-04-18 2023-04-18 Image data processing method for attention of multi-scale channel

Country Status (1)

Country Link
CN (1) CN116129207B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118094343A (en) * 2024-04-23 2024-05-28 安徽大学 Attention mechanism-based LSTM machine residual service life prediction method
CN118397281A (en) * 2024-06-24 2024-07-26 湖南工商大学 Image segmentation model training method, segmentation method and device based on artificial intelligence

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180231871A1 (en) * 2016-06-27 2018-08-16 Zhejiang Gongshang University Depth estimation method for monocular image based on multi-scale CNN and continuous CRF
CN110853051A (en) * 2019-10-24 2020-02-28 北京航空航天大学 Cerebrovascular image segmentation method based on multi-attention dense connection generation countermeasure network
CN111489358A (en) * 2020-03-18 2020-08-04 华中科技大学 Three-dimensional point cloud semantic segmentation method based on deep learning
CN112017198A (en) * 2020-10-16 2020-12-01 湖南师范大学 Right ventricle segmentation method and device based on self-attention mechanism multi-scale features
CN112784764A (en) * 2021-01-27 2021-05-11 南京邮电大学 Expression recognition method and system based on local and global attention mechanism
CN113627295A (en) * 2021-07-28 2021-11-09 中汽创智科技有限公司 Image processing method, device, equipment and storage medium
CN114842553A (en) * 2022-04-18 2022-08-02 安庆师范大学 Behavior detection method based on residual shrinkage structure and non-local attention
CN115240201A (en) * 2022-09-21 2022-10-25 江西师范大学 Chinese character generation method for alleviating network mode collapse problem by utilizing Chinese character skeleton information
CN115761258A (en) * 2022-11-10 2023-03-07 山西大学 Image direction prediction method based on multi-scale fusion and attention mechanism
CN115880225A (en) * 2022-11-10 2023-03-31 北京工业大学 Dynamic illumination human face image quality enhancement method based on multi-scale attention mechanism


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ABHINAV SAGAR ET AL.: "DMSANet: Dual Multi Scale Attention Network", arXiv:2106.08382v2 [cs.CV], pages 1-10
GANG LIU ET AL.: "Multiple Dirac Points and Hydrogenation-Induced Magnetism of Germanene Layer on Al (111) Surface", Journal of Physical Chemistry Letters, pages 4936-4942
ZHANG YUXI: "Research on a voiceprint recognition model based on multi-scale feature joint attention", China Master's Theses Full-text Database, Information Science and Technology, pages 136-362
GAO DAN; CHEN JIANYING; XIE YING: "A-PSPNet: a PSPNet image semantic segmentation model fused with an attention mechanism", Journal of China Academy of Electronics and Information Technology, no. 06, pages 28-33


Also Published As

Publication number Publication date
CN116129207B (en) 2023-08-04

Similar Documents

Publication Publication Date Title
CN116129207B (en) Image data processing method for attention of multi-scale channel
CN111462126B (en) Semantic image segmentation method and system based on edge enhancement
US20220230282A1 (en) Image processing method, image processing apparatus, electronic device and computer-readable storage medium
CN113011329B (en) Multi-scale feature pyramid network-based and dense crowd counting method
CN111652081B (en) Video semantic segmentation method based on optical flow feature fusion
CN112232134B (en) Human body posture estimation method based on hourglass network and attention mechanism
CN114387521B (en) Remote sensing image building extraction method based on attention mechanism and boundary loss
CN110569851A (en) real-time semantic segmentation method for gated multi-layer fusion
CN113361493B (en) Facial expression recognition method robust to different image resolutions
CN115620118A (en) Saliency target detection method based on multi-scale expansion convolutional neural network
Wang et al. TF-SOD: a novel transformer framework for salient object detection
Mo et al. PVDet: Towards pedestrian and vehicle detection on gigapixel-level images
CN113327254A (en) Image segmentation method and system based on U-type network
CN115984747A (en) Video saliency target detection method based on dynamic filter
CN112581423A (en) Neural network-based rapid detection method for automobile surface defects
CN112950505B (en) Image processing method, system and medium based on generation countermeasure network
CN111488839B (en) Target detection method and target detection system
CN117557779A (en) YOLO-based multi-scale target detection method
CN111402140A (en) Single image super-resolution reconstruction system and method
CN116597144A (en) Image semantic segmentation method based on event camera
CN116246109A (en) Multi-scale hole neighborhood attention computing backbone network model and application thereof
CN113810597A (en) Rapid image and scene rendering method based on semi-prediction filtering
CN113222016A (en) Change detection method and device based on cross enhancement of high-level and low-level features
CN108629737B (en) Method for improving JPEG format image space resolution
CN118397192B (en) Point cloud analysis method based on double-geometry learning and self-adaptive sparse attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant