CN116129207A - Image data processing method for attention of multi-scale channel - Google Patents
Image data processing method for attention of multi-scale channel
- Publication number
- CN116129207A (application CN202310414590.1A)
- Authority
- CN
- China
- Prior art keywords
- global
- input data
- data
- channels
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
- Facsimile Image Signal Circuits (AREA)
Abstract
The invention discloses an image data processing method based on multi-scale channel attention. By extracting both the global features and the local features of the input data, the method makes the convolutional neural network attend to the overall information of the input data as well as its local detail features, alleviating the problems of target aggregation and target occlusion in complex scenes.
Description
Technical Field
The invention relates to the field of computer vision, and in particular to an image data processing method based on multi-scale channel attention.
Background
The channel attention mechanism can markedly improve a model's expressiveness and generalization ability, has low computational cost, and is easy to integrate into existing convolutional neural network architectures. Owing to these advantages, channel attention has been widely used in deep-learning applications such as image classification, object detection, and semantic segmentation.
In essence, a channel attention mechanism computes a weighted average over the features of different channels, yielding richer, more stable, and more reliable feature representations.
Existing channel attention modules include SE, ECA, and CA. Each attends either to the detail information in some local feature or to the semantic information in a global feature, but not to both at once, which leads to insufficient feature expression along the channel dimension.
Disclosure of Invention
The invention aims to provide an image data processing method based on multi-scale channel attention.
The problem the invention aims to solve is:
to provide an image data processing method for multi-scale channel attention that extracts both the global features and the local features of the input data, so that the convolutional neural network attends to the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
The image data processing method based on multi-scale channel attention adopts the following technical scheme.
An image data processing method for multi-scale channel attention comprises the following steps:
S21: digitize the input data (an original image or a feature map), convert the extracted features into numbers stored in tensor matrices, and apply normalization to accelerate the convergence of the convolutional neural network;
S22: extract and fuse features from the input data using a method that combines a global channel attention mechanism and a local channel attention mechanism;
S23: the global channel attention mechanism uses global average pooling, a one-dimensional convolution layer with an adaptively selected kernel size, and a Sigmoid activation function. The global channel attention applies global average pooling and an element-wise transformation to the feature map to adaptively adjust the weights of the different channels, so that the model attends to more important features, improving its classification performance and robustness. The global average pooling is computed as: g_c = (1/(W×H)) Σ_{i=1..W} Σ_{j=1..H} X_c(i, j), where g_c denotes the global average pooling result for channel c, X is the input image of size W×H×C, W, H and C denote respectively the width, height and number of channels of the input image, and i and j denote respectively the pixel positions along the width and height;
The adaptive selection is computed as: k = ψ(C) = |log2(C)/γ + b/γ|_odd, where k denotes the kernel size of the one-dimensional convolution, C denotes the number of channels, |·|_odd means that k can only be odd (the nearest odd number is taken), and γ and b adjust the ratio between k and C; in the present invention, γ and b are taken as 2 and 1 respectively;
the Sigmoid activation function is also called an S-shaped growth curve, and the calculation formula is:whereinIs input;
S24: the local channel attention mechanism adopts a multi-layer perceptron (MLP) implemented with two-dimensional convolutions to extract local features. The MLP consists of two two-dimensional convolutions with kernel size 1 and an intermediate ReLU activation. Passing through the two-dimensional convolutions changes only the number of channels of the input data: the number of output channels of the first convolution is one sixteenth of the number of input channels, and the number of output channels of the second convolution matches the number of channels at the embedding position. The local channel attention helps the model better capture the local information in the input features;
S25: the ReLU function retains only positive elements and discards all negative elements by setting the corresponding activation values to 0;
S26: the output of the global attention and the output of the local attention are fused, the Sigmoid function activates the fused data to obtain the final attention weights, and the activated data is then multiplied pixel by pixel with the input data;
S27: the Sigmoid function compresses the existing data according to its range, mapping any input to a value in the interval (0, 1) so as to ensure normalization;
S28: the pixel-by-pixel multiplication of the input data with the activated data applies position-dependent weighting to the input data, so that the network focuses more on both global features and local features.
Further, after the input data passes through the two-dimensional convolutions in step S24, only the number of channels changes; across the whole MLP architecture, the channels of the input data are first shrunk and then expanded to estimate the attention among channels, wherein the shrinkage coefficient is r, the feature size after shrinkage is H×W×C/r, and, using the ReLU activation function, the feature size after expansion is H×W×C.
Further, steps S23 and S24 extract the global features and the local features of the input data respectively, by the global average pooling in the global channel attention mechanism and by the multi-layer perceptron MLP in the local channel attention mechanism; step S26 then fuses the outputs of the global channel attention mechanism and the local channel attention mechanism of steps S23 and S24, i.e., performs feature fusion over the different features, so that the convolutional neural network attends to both the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
The invention has the beneficial effects that the multi-scale channel attention image data processing method further alleviates problems such as low detection precision and a high miss rate caused by the heavy aggregation and severe occlusion typical of small-target detection in complex scenes. The method extracts the global features and the local features of the data and fuses the different features, so that the convolutional neural network attends to both the overall information and the local detail features of the input data, alleviating target aggregation and target occlusion in complex scenes.
Drawings
FIG. 1 is a schematic diagram of the image data processing method for multi-scale channel attention according to the present invention;
FIG. 2 is a graph of the ReLU rectified linear function in the present invention;
FIG. 3 is a schematic diagram of Sigmoid-function data normalization in the present invention.
Detailed Description
The invention is further clarified and fully described below in connection with the accompanying drawings, to which the scope of protection of the invention is not limited.
Examples
As shown in figs. 1 to 3, an image data processing method for multi-scale channel attention comprises the following steps:
S21: digitize the input data (an original image or a feature map), convert the extracted features into numbers stored in tensor matrices, and apply normalization to accelerate the convergence of the convolutional neural network;
S22: as shown in fig. 1, extract and fuse features from the input data using a method that combines a global channel attention mechanism and a local channel attention mechanism;
S23: the global channel attention mechanism uses global average pooling, a one-dimensional convolution layer with an adaptively selected kernel size, and a Sigmoid activation function, as shown in the left column of fig. 1. The global channel attention applies global average pooling and an element-wise transformation to the feature map to adaptively adjust the weights of the different channels, so that the model attends to more important features, improving its classification performance and robustness. The global average pooling is computed as: g_c = (1/(W×H)) Σ_{i=1..W} Σ_{j=1..H} X_c(i, j), where g_c denotes the global average pooling result for channel c, X is the input image of size W×H×C, W, H and C denote respectively the width, height and number of channels of the input image, and i and j denote respectively the pixel positions along the width and height;
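As a concrete illustration, the global average pooling step can be sketched in plain Python; the nested-list tensor layout and the tiny 2-channel 2×2 feature map are illustrative choices, not part of the patent:

```python
def global_avg_pool(x):
    """Global average pooling: mean over the H x W spatial positions of each
    channel. x is a nested list of shape [C][H][W]; returns a list of C
    per-channel means (the vector g_c in the formula above)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

# A tiny 2-channel, 2x2 feature map.
x = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [4.0, 4.0]]]
print(global_avg_pool(x))  # -> [2.5, 2.0]
```

In a real network this would be a framework pooling layer over a W×H×C tensor; the nested lists just make the summation explicit.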
The adaptive selection is computed as: k = ψ(C) = |log2(C)/γ + b/γ|_odd, where k denotes the kernel size of the one-dimensional convolution, C denotes the number of channels, |·|_odd means that k can only be odd (the nearest odd number is taken), and γ and b adjust the ratio between k and C; in the present invention, γ and b are taken as 2 and 1 respectively;
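The adaptive kernel-size rule is the ψ(C) function popularized by ECA-Net; a minimal sketch, assuming γ = 2 and b = 1 as stated above (the function name is our own):

```python
import math

def adaptive_kernel_size(C, gamma=2, b=1):
    """k = |log2(C)/gamma + b/gamma| rounded to the nearest odd number, so the
    1-D convolution's receptive field grows with the channel count C."""
    t = int(abs(math.log2(C) / gamma + b / gamma))
    return t if t % 2 == 1 else t + 1  # force k to be odd

print(adaptive_kernel_size(64))   # -> 3
print(adaptive_kernel_size(256))  # -> 5
```

With more channels the formula selects a larger (odd) kernel, letting wider channel neighborhoods interact in the 1-D convolution.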
The Sigmoid activation function, also called the S-shaped growth curve, is shown in fig. 3 and computed as: σ(x) = 1/(1 + e^(−x)), where x is the input;
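A direct transcription of the Sigmoid formula (the function name is illustrative):

```python
import math

def sigmoid(x):
    """S-shaped growth curve: maps any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))  # -> 0.5, the curve's midpoint
```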
S24: the local channel attention mechanism adopts a multi-layer perceptron (MLP) implemented with two-dimensional convolutions to extract local features. The MLP consists of two two-dimensional convolutions with kernel size 1 and an intermediate ReLU activation; the ReLU activation sets the outputs of some neurons to 0, which reduces the interdependence between parameters and alleviates overfitting. Passing through the two-dimensional convolutions changes only the number of channels of the input data: the number of output channels of the first convolution is one sixteenth of the number of input channels, and the number of output channels of the second convolution matches the number of channels at the embedding position. The local channel attention helps the model better capture the local information in the input features, as shown in the right column of fig. 1;
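The shrink-then-expand MLP of step S24 (two 1×1 convolutions with an intermediate ReLU) can be sketched in plain Python. The weights and the shrink ratio of 2 here are illustrative stand-ins; the patent shrinks to one sixteenth of the input channels:

```python
def conv1x1(x, w):
    """A 1x1 2-D convolution is a per-pixel linear map over channels.
    x: [C_in][H][W], w: [C_out][C_in] -> output of shape [C_out][H][W]."""
    C_in, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(w[o][c] * x[c][i][j] for c in range(C_in))
              for j in range(W)] for i in range(H)] for o in range(len(w))]

def relu_map(x):
    """ReLU: keep positive elements, set negative ones to 0."""
    return [[[max(0.0, v) for v in row] for row in ch] for ch in x]

def local_channel_attention(x, w_shrink, w_expand):
    """MLP as conv1x1 -> ReLU -> conv1x1: channels shrink, then expand back."""
    return conv1x1(relu_map(conv1x1(x, w_shrink)), w_expand)

# 2 channels at a single pixel, shrunk to 1 channel and expanded back to 2.
x = [[[1.0]], [[2.0]]]
w_shrink = [[0.5, 0.5]]     # 2 -> 1 channels
w_expand = [[1.0], [-1.0]]  # 1 -> 2 channels
print(local_channel_attention(x, w_shrink, w_expand))  # -> [[[1.5]], [[-1.5]]]
```

Because the kernel size is 1, spatial positions never mix; only the channel dimension changes, exactly as step S24 states.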
S25: the ReLU function retains only positive elements and discards all negative elements by setting the corresponding activation values to 0, as shown in fig. 2;
S26: the output of the global attention and the output of the local attention are fused, the Sigmoid function activates the fused data to obtain the final attention weights, and the activated data is then multiplied pixel by pixel with the input data;
S27: the Sigmoid function compresses the existing data to a value in the interval (0, 1) according to its range so as to ensure normalization, as shown in fig. 1;
S28: the pixel-by-pixel multiplication of the input data with the activated data applies position-dependent weighting to the input data, so that the network focuses more on global and local features, as shown in fig. 1.
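Steps S26 to S28 can be sketched end to end as follows. The patent does not spell out the fusion operator, so broadcasting the per-channel global branch onto the per-pixel local branch by element-wise addition is an assumption here, and all names and numbers are illustrative:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fuse_and_weight(x, g_att, l_att):
    """Fuse the global and local attention outputs (assumed element-wise add),
    activate with Sigmoid to obtain weights in (0, 1), then reweight the input
    pixel by pixel. x, l_att: [C][H][W]; g_att: one scalar per channel."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[x[c][i][j] * sigmoid(g_att[c] + l_att[c][i][j])
              for j in range(W)] for i in range(H)] for c in range(C)]

# With zero attention logits the Sigmoid weight is 0.5 everywhere,
# so the input is simply halved.
x = [[[2.0, 4.0]]]
print(fuse_and_weight(x, [0.0], [[[0.0, 0.0]]]))  # -> [[[1.0, 2.0]]]
```

The Sigmoid keeps every weight inside (0, 1), which is the normalization guarantee stated in step S27.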
After the input data passes through the two-dimensional convolutions in step S24, only the number of channels changes; across the whole MLP architecture, the channels of the input data are first shrunk and then expanded to estimate the attention among channels, wherein the shrinkage coefficient is r, the feature size after shrinkage is H×W×C/r, and, using the ReLU activation function, the feature size after expansion is H×W×C.
Steps S23 and S24 extract the global features and the local features of the input data respectively, by the global average pooling in the global channel attention mechanism and by the multi-layer perceptron MLP in the local channel attention mechanism; step S26 then fuses the outputs of the global channel attention mechanism and the local channel attention mechanism of steps S23 and S24, i.e., performs feature fusion over the different features, so that the convolutional neural network attends to both the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
The embodiments disclosed above are preferred embodiments of the invention, but the invention is not limited thereto; those skilled in the art will readily appreciate from the foregoing description that various extensions and modifications can be made without departing from the spirit of the invention.
Claims (3)
1. A method of processing image data of a multi-scale channel attention, comprising the steps of:
S21: digitize the input data, namely an original image or a feature map, convert the extracted features into numbers stored in tensor matrices, and apply normalization to accelerate the convergence of the convolutional neural network;
S22: extract and fuse features from the input data using a method that combines a global channel attention mechanism and a local channel attention mechanism;
S23: the global channel attention mechanism uses global average pooling, a one-dimensional convolution layer with an adaptively selected kernel size, and a Sigmoid activation function, wherein the global average pooling is computed as: g_c = (1/(W×H)) Σ_{i=1..W} Σ_{j=1..H} X_c(i, j), where g_c denotes the global average pooling result for channel c, X is the input image of size W×H×C, W, H and C denote respectively the width, height and number of channels of the input image, and i and j denote respectively the pixel positions along the width and height;
the adaptive selection is computed as: k = ψ(C) = |log2(C)/γ + b/γ|_odd, where k denotes the kernel size of the one-dimensional convolution, C denotes the number of channels, |·|_odd means that k can only be odd (the nearest odd number is taken), and γ and b adjust the ratio between k and C;
the Sigmoid activation function, also called the S-shaped growth curve, is computed as: σ(x) = 1/(1 + e^(−x)), where x is the input;
S24: the local channel attention mechanism adopts a multi-layer perceptron (MLP) implemented with two-dimensional convolutions to extract local features; the MLP consists of two two-dimensional convolutions with kernel size 1 and an intermediate ReLU activation; passing through the two-dimensional convolutions changes only the number of channels of the input data, the number of output channels of the first convolution being one sixteenth of the number of input channels and the number of output channels of the second convolution matching the number of channels at the embedding position;
S25: the ReLU function retains only positive elements and discards all negative elements by setting the corresponding activation values to 0;
S26: the output of the global attention and the output of the local attention are fused, the Sigmoid function activates the fused data to obtain the final attention weights, and the activated data is then multiplied pixel by pixel with the input data;
S27: the Sigmoid function compresses the existing data according to its range, mapping any input to a value in the interval (0, 1) so as to ensure normalization;
S28: the pixel-by-pixel multiplication of the input data with the activated data applies position-dependent weighting to the input data, so that the network focuses more on both global features and local features.
2. A method of processing image data for multi-scale channel attention as recited in claim 1, wherein,
steps S23 and S24 extract the global features and the local features of the input data respectively, by the global average pooling in the global channel attention mechanism and by the multi-layer perceptron MLP in the local channel attention mechanism, and step S26 fuses the outputs of the global channel attention mechanism and the local channel attention mechanism of steps S23 and S24, i.e., performs feature fusion over the different features, so that the convolutional neural network attends to both the overall information and the local detail features of the input data, alleviating the problems of target aggregation and target occlusion in complex scenes.
3. The method according to claim 1, wherein passing the input data through the two-dimensional convolutions in step S24 changes only the number of channels, and in the whole MLP architecture the channels of the input data are first shrunk and then expanded to estimate the attention between channels, wherein the shrinkage coefficient is r, the feature size after shrinkage is H×W×C/r, and, using the ReLU activation function, the feature size after expansion is H×W×C.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310414590.1A CN116129207B (en) | 2023-04-18 | 2023-04-18 | Image data processing method for attention of multi-scale channel |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116129207A true CN116129207A (en) | 2023-05-16 |
CN116129207B CN116129207B (en) | 2023-08-04 |
Family
ID=86301329
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310414590.1A Active CN116129207B (en) | 2023-04-18 | 2023-04-18 | Image data processing method for attention of multi-scale channel |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116129207B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118094343A (en) * | 2024-04-23 | 2024-05-28 | 安徽大学 | Attention mechanism-based LSTM machine residual service life prediction method |
CN118397281A (en) * | 2024-06-24 | 2024-07-26 | 湖南工商大学 | Image segmentation model training method, segmentation method and device based on artificial intelligence |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180231871A1 (en) * | 2016-06-27 | 2018-08-16 | Zhejiang Gongshang University | Depth estimation method for monocular image based on multi-scale CNN and continuous CRF |
CN110853051A (en) * | 2019-10-24 | 2020-02-28 | 北京航空航天大学 | Cerebrovascular image segmentation method based on multi-attention dense connection generation countermeasure network |
CN111489358A (en) * | 2020-03-18 | 2020-08-04 | 华中科技大学 | Three-dimensional point cloud semantic segmentation method based on deep learning |
CN112017198A (en) * | 2020-10-16 | 2020-12-01 | 湖南师范大学 | Right ventricle segmentation method and device based on self-attention mechanism multi-scale features |
CN112784764A (en) * | 2021-01-27 | 2021-05-11 | 南京邮电大学 | Expression recognition method and system based on local and global attention mechanism |
CN113627295A (en) * | 2021-07-28 | 2021-11-09 | 中汽创智科技有限公司 | Image processing method, device, equipment and storage medium |
CN114842553A (en) * | 2022-04-18 | 2022-08-02 | 安庆师范大学 | Behavior detection method based on residual shrinkage structure and non-local attention |
CN115240201A (en) * | 2022-09-21 | 2022-10-25 | 江西师范大学 | Chinese character generation method for alleviating network mode collapse problem by utilizing Chinese character skeleton information |
CN115761258A (en) * | 2022-11-10 | 2023-03-07 | 山西大学 | Image direction prediction method based on multi-scale fusion and attention mechanism |
CN115880225A (en) * | 2022-11-10 | 2023-03-31 | 北京工业大学 | Dynamic illumination human face image quality enhancement method based on multi-scale attention mechanism |
- 2023-04-18: application CN202310414590.1A filed; granted as patent CN116129207B (status: active)
Non-Patent Citations (4)
Title |
---|
ABHINAV SAGAR et al.: "DMSANet: Dual Multi Scale Attention Network", arXiv:2106.08382v2 [cs.CV], pages 1-10 *
GANG LIU et al.: "Multiple Dirac Points and Hydrogenation-Induced Magnetism of Germanene Layer on Al (111) Surface", Journal of Physical Chemistry Letters, pages 4936-4942 *
ZHANG Yuxi: "Research on a voiceprint recognition model based on multi-scale feature joint attention", China Master's Theses Full-text Database, Information Science and Technology series, pages 136-362 *
GAO Dan; CHEN Jianying; XIE Ying: "A-PSPNet: a PSPNet image semantic segmentation model incorporating an attention mechanism", Journal of China Academy of Electronics and Information Technology, no. 06, pages 28-33 *
Also Published As
Publication number | Publication date |
---|---|
CN116129207B (en) | 2023-08-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116129207B (en) | Image data processing method for attention of multi-scale channel | |
CN111462126B (en) | Semantic image segmentation method and system based on edge enhancement | |
US20220230282A1 (en) | Image processing method, image processing apparatus, electronic device and computer-readable storage medium | |
CN113011329B (en) | Multi-scale feature pyramid network-based and dense crowd counting method | |
CN111652081B (en) | Video semantic segmentation method based on optical flow feature fusion | |
CN112232134B (en) | Human body posture estimation method based on hourglass network and attention mechanism | |
CN114387521B (en) | Remote sensing image building extraction method based on attention mechanism and boundary loss | |
CN110569851A (en) | real-time semantic segmentation method for gated multi-layer fusion | |
CN113361493B (en) | Facial expression recognition method robust to different image resolutions | |
CN115620118A (en) | Saliency target detection method based on multi-scale expansion convolutional neural network | |
Wang et al. | TF-SOD: a novel transformer framework for salient object detection | |
Mo et al. | PVDet: Towards pedestrian and vehicle detection on gigapixel-level images | |
CN113327254A (en) | Image segmentation method and system based on U-type network | |
CN115984747A (en) | Video saliency target detection method based on dynamic filter | |
CN112581423A (en) | Neural network-based rapid detection method for automobile surface defects | |
CN112950505B (en) | Image processing method, system and medium based on generation countermeasure network | |
CN111488839B (en) | Target detection method and target detection system | |
CN117557779A (en) | YOLO-based multi-scale target detection method | |
CN111402140A (en) | Single image super-resolution reconstruction system and method | |
CN116597144A (en) | Image semantic segmentation method based on event camera | |
CN116246109A (en) | Multi-scale hole neighborhood attention computing backbone network model and application thereof | |
CN113810597A (en) | Rapid image and scene rendering method based on semi-prediction filtering | |
CN113222016A (en) | Change detection method and device based on cross enhancement of high-level and low-level features | |
CN108629737B (en) | Method for improving JPEG format image space resolution | |
CN118397192B (en) | Point cloud analysis method based on double-geometry learning and self-adaptive sparse attention |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||