CN114037899A - VIT-based hyperspectral remote sensing image-oriented classification radial accumulation position coding system - Google Patents
- Publication number
- CN114037899A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Abstract
The invention relates to a ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification, comprising: a data extraction and preprocessing module, which crops data from the data set, preprocesses it by principal component analysis (PCA), and inputs the processed data into the model framework; a position coding module, which accumulates the input data from the central point outward toward the periphery so that the data at each point carries position information; a data splitting module, which divides the data into a plurality of block-shaped regions, flattens each region into a one-dimensional vector, and inputs the vectors into the ViT model; and a main model module, which comprises a ViT model and processes the input data to output a classification result. The invention adds more effective spatial information to the data input to the ViT, thereby improving pixel-level classification accuracy.
Description
Technical Field
The invention relates to the field of remote sensing, and in particular to a ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification.
Background
Hyperspectral image classification aims to assign each pixel to a class. Research on this technique has attracted great attention in academia and industry, and it has significant application value in land cover detection, city planning, and traffic monitoring. Hyperspectral images have high spectral dimensionality and low spatial resolution, which makes it a great challenge to fully exploit both the spectral information contained in each pixel and the spatial information contained in its surrounding pixels. Current hyperspectral image classification work mainly adopts deep learning methods, which improve accuracy markedly over traditional machine learning methods, but their multilayer convolutional structure also greatly increases the amount of computation. Vision Transformer (ViT) is a neural network model based on the self-attention mechanism that has emerged in recent years; it captures global context information through attention and thereby establishes long-range dependencies on the target, which improves the representational capability of the features. For the same receptive field, its amount of computation is greatly reduced compared with such convolution-based deep learning methods.
For the ViT model, the position coding method determines how effectively the model uses the spatial information in the data. The commonly used method is trigonometric (sinusoidal) position coding, which is widely used in natural language processing (NLP). It performs well on one-dimensional data but cannot adequately express the two-dimensional spatial relationship between pixels, and is therefore not well suited to high-dimensional hyperspectral remote sensing images.
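The limitation described above can be illustrated with a minimal sketch. It assumes the standard sinusoidal formulation from the NLP literature and a row-major flattening of the window; the function names and the 5 × 5 window are illustrative choices, not taken from the patent.

```python
import math

def sinusoidal_encoding(num_positions, dim):
    """Standard 1-D sinusoidal position encoding: one vector per sequence index."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            # Channels 2k and 2k+1 share one frequency.
            angle = pos / (10000 ** ((i - i % 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def flat_index(row, col, width):
    """Row-major index of a pixel after a 2-D window is flattened (assumed layout)."""
    return row * width + col

width = 5
center = flat_index(2, 2, width)
right = flat_index(2, 3, width)   # horizontal neighbour of the centre
below = flat_index(3, 2, width)   # vertical neighbour of the centre

# Both neighbours are one pixel away in 2-D, yet their sequence offsets differ.
print(right - center, below - center)  # prints: 1 5
```

Because the encoding depends only on the sequence index, two pixels that are equally close to the centre in the image can receive encodings that are far apart in the sequence.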
Disclosure of Invention
In view of this, the present invention provides a ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification, which adds more effective spatial information to the data input to the ViT and thereby improves pixel-level classification accuracy.
To achieve this purpose, the invention adopts the following technical scheme:
A ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification comprises:
a data extraction and preprocessing module, which crops data from the data set, preprocesses it by principal component analysis (PCA), and inputs the processed data into the model framework;
a position coding module, which accumulates the input data from the central point outward toward the periphery so that the data at each point carries position information;
a data splitting module, which divides the data into a plurality of block-shaped regions, flattens each region into a one-dimensional vector, and inputs the vectors into the ViT model;
and a main model module, which comprises a ViT model and processes the input data to output a classification result.
Furthermore, by accumulating the data radially from the central point toward the periphery, the position coding module distinguishes the positions of the individual points in a manner similar to conventional position coding, and gives the model absolute rotation invariance when the block-shaped regions are of size 1 × 1.
Furthermore, when the input data is preprocessed by principal component analysis (PCA), the dimensionality of the data is not changed.
Further, the data splitting module divides the input data into N block-shaped regions of size 1 × 1 or 3 × 3, flattens each of the N regions into a one-dimensional vector, and inputs the vectors into the ViT model.
A coding method for the ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification comprises the following steps:
step S1: cropping data from the data set according to a window size, with the class of the central point taken as the classification label; preprocessing the data by principal component analysis (PCA); and inputting the processed data into the model framework;
step S2: processing the input data with the radial accumulation position coding method provided herein, accumulating the input data from the central point outward toward the periphery so that the data at each point carries position information;
step S3: dividing the input data into N block-shaped regions of size 1 × 1 or 3 × 3, flattening each of the N regions into a one-dimensional vector, and inputting the vectors into the ViT model;
step S4: inputting the data into the ViT model to obtain the classification result.
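Step S1 can be sketched as follows. This is a minimal illustration under stated assumptions: the nested-list data layout, the function name `extract_window`, and the toy scene are hypothetical, and border handling and the PCA step are left out.

```python
def extract_window(cube, labels, row, col, window):
    """Cut a window x window patch centred on (row, col); label = centre class.

    cube   : nested list, cube[r][c] is the list of band values of one pixel
    labels : nested list, labels[r][c] is the class id of that pixel
    Border handling is omitted: the caller must keep the window inside the image.
    """
    half = window // 2
    patch = [
        [cube[r][c] for c in range(col - half, col + half + 1)]
        for r in range(row - half, row + half + 1)
    ]
    return patch, labels[row][col]

# Toy 5 x 5 scene with 2 bands per pixel (values are made up).
cube = [[[r, c] for c in range(5)] for r in range(5)]
labels = [[(r + c) % 3 for c in range(5)] for r in range(5)]

patch, label = extract_window(cube, labels, row=2, col=2, window=3)
print(len(patch), len(patch[0]), patch[1][1], label)  # prints: 3 3 [2, 2] 1
```

Each training sample is therefore a small spatial neighbourhood whose label is that of its central pixel, which is why the later position coding is organised around the centre.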
Compared with the prior art, the invention has the following beneficial effect:
it adds more effective spatial information to the data input to the ViT, thereby improving pixel-level classification accuracy.
Drawings
FIG. 1 is a schematic diagram of the system architecture of the present invention;
FIG. 2 is a schematic diagram of radial accumulation position encoding according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the superposition method according to an embodiment of the invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to FIG. 1, the present invention provides a ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification, which comprises:
a data extraction and preprocessing module, which crops data from the data set, preprocesses it by principal component analysis (PCA), and inputs the processed data into the model framework;
a position coding module, which replaces the trigonometric position coding commonly used in the ViT model and accumulates the input data from the central point outward toward the periphery, as shown in FIG. 2, so that the data at each point carries position information. The superposition covers three cases, shown in FIG. 3, in which the current point, relative to the central point, lies on a diagonal, between a diagonal and a perpendicular bisector, or on a perpendicular bisector;
a data splitting module, which divides the data into a plurality of block-shaped regions, flattens each region into a one-dimensional vector, and inputs the vectors into the ViT model;
and a main model module, which comprises a ViT model and processes the input data to output a classification result.
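The text names three geometric cases for the accumulation but the exact rule is given only in FIG. 3, which is not reproduced here. The sketch below is therefore one possible reading, not the patented rule: it assumes that a point on a diagonal or on a bisector adds the encoded value of its single inner neighbour, and that a point in between adds the mean of its two inner neighbours. The function names and the averaging are assumptions.

```python
def sign(v):
    return (v > 0) - (v < 0)

def predecessors(dy, dx):
    """Offsets of the inner neighbours a point accumulates from (assumed rule)."""
    sy, sx = sign(dy), sign(dx)
    if abs(dy) == abs(dx) or dy == 0 or dx == 0:
        # On a diagonal or on a bisector: one step towards the centre.
        return [(dy - sy, dx - sx)]
    if abs(dy) > abs(dx):
        # In between, nearer the vertical axis: straight and diagonal inner neighbours.
        return [(dy - sy, dx), (dy - sy, dx - sx)]
    return [(dy, dx - sx), (dy - sy, dx - sx)]

def radial_accumulate(patch):
    """patch[r][c] is a list of band values; returns the accumulated patch."""
    size = len(patch)
    half = size // 2
    out = [[list(px) for px in row] for row in patch]
    offsets = [(r - half, c - half) for r in range(size) for c in range(size)]
    # Visit rings from the centre outwards so inner neighbours are already encoded.
    offsets.sort(key=lambda o: max(abs(o[0]), abs(o[1])))
    for dy, dx in offsets:
        if dy == 0 and dx == 0:
            continue  # the centre keeps its own value
        preds = predecessors(dy, dx)
        for b in range(len(patch[0][0])):
            inner = sum(out[half + py][half + px][b] for py, px in preds) / len(preds)
            out[half + dy][half + dx][b] += inner
    return out

ones = [[[1.0] for _ in range(5)] for _ in range(5)]
encoded = radial_accumulate(ones)
for row in encoded:
    print([px[0] for px in row])
```

On a constant patch this rule maps every point to its ring index plus one (1.0 at the centre, 2.0 on the first ring, 3.0 on the second), which shows how the accumulated value itself comes to carry the distance from the centre.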
Preferably, in this embodiment, by accumulating the data radially from the central point toward the periphery, the position coding module distinguishes the positions of the individual points in a manner similar to conventional position coding, and gives the model absolute rotation invariance when the block-shaped regions are of size 1 × 1.
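The rotation-invariance statement can be checked on a toy case. The sketch below rests on several assumptions that are not in the original text: a hypothetical accumulation rule (single inner neighbour on diagonals and bisectors, mean of two inner neighbours otherwise), a single band, rotations by 90 degrees only, and a model that treats the 1 × 1 tokens as an unordered set because no index-based position embedding is added.

```python
def sign(v):
    return (v > 0) - (v < 0)

def radial_accumulate(patch):
    """Hypothetical radial accumulation on a single-band square patch of numbers."""
    size, half = len(patch), len(patch) // 2
    out = [row[:] for row in patch]
    offsets = sorted(
        ((r - half, c - half) for r in range(size) for c in range(size)),
        key=lambda o: max(abs(o[0]), abs(o[1])),
    )
    for dy, dx in offsets:
        if (dy, dx) == (0, 0):
            continue  # the centre keeps its own value
        sy, sx = sign(dy), sign(dx)
        if abs(dy) == abs(dx) or dy == 0 or dx == 0:
            preds = [(dy - sy, dx - sx)]
        elif abs(dy) > abs(dx):
            preds = [(dy - sy, dx), (dy - sy, dx - sx)]
        else:
            preds = [(dy, dx - sx), (dy - sy, dx - sx)]
        inner = sum(out[half + py][half + px] for py, px in preds) / len(preds)
        out[half + dy][half + dx] += inner
    return out

def rotate90(patch):
    """Rotate a square patch by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]

def tokens_1x1(patch):
    """With 1 x 1 blocks every pixel is one token; token order is ignored here."""
    return sorted(v for row in patch for v in row)

patch = [[(3 * r + 7 * c) % 11 for c in range(5)] for r in range(5)]
same = tokens_1x1(radial_accumulate(patch)) == tokens_1x1(radial_accumulate(rotate90(patch)))
print(same)  # prints: True
```

Under these assumptions a rotation only permutes the encoded tokens, so any model that is insensitive to token order sees the same input. With 3 × 3 blocks the contents of each token change under rotation, which is consistent with the text restricting the claim to 1 × 1.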
Preferably, in this embodiment, when the input data is preprocessed by principal component analysis (PCA), the dimensionality of the data is not changed.
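One reading of "the dimensionality of the data is not changed" is that all principal components are retained, so that PCA acts as a decorrelating rotation of the spectral axis rather than as a reduction. Under that assumption, and with symbols introduced here purely for illustration (n pixels, B bands, data matrix X, band mean μ, eigenvector matrix W), the preprocessing can be written as:

```latex
\begin{aligned}
X &\in \mathbb{R}^{n \times B}, \qquad \mu = \tfrac{1}{n}\, X^{\top}\mathbf{1}_n \in \mathbb{R}^{B},\\
C &= \tfrac{1}{n-1}\,(X - \mathbf{1}_n\mu^{\top})^{\top}(X - \mathbf{1}_n\mu^{\top}) = W \Lambda W^{\top}, \qquad W^{\top}W = I_B,\\
Y &= (X - \mathbf{1}_n\mu^{\top})\,W \in \mathbb{R}^{n \times B}.
\end{aligned}
```

Because W is a square B × B orthogonal matrix, Y has the same shape as X: every pixel still has B values, now ordered by decreasing variance.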
Preferably, in this embodiment, the data splitting module divides the input data into N block-shaped regions of size 1 × 1 or 3 × 3, flattens each of the N regions into a one-dimensional vector, and inputs the vectors into the ViT model.
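A minimal sketch of the splitting step follows. It assumes non-overlapping blocks and a window side that is a multiple of the block size; neither is stated in the text, and the function name and the 9 × 9 window with 4 bands are illustrative only.

```python
def split_into_blocks(patch, block):
    """Split a square patch into non-overlapping block x block regions and
    flatten each region into one 1-D vector (one ViT token per region).

    patch[r][c] is a list of band values; the patch side must be a multiple
    of `block` (an assumption of this sketch).
    """
    size = len(patch)
    if size % block != 0:
        raise ValueError("patch side must be divisible by the block size")
    tokens = []
    for r0 in range(0, size, block):
        for c0 in range(0, size, block):
            vec = []
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    vec.extend(patch[r][c])
            tokens.append(vec)
    return tokens

# 9 x 9 window with 4 bands per pixel (toy values).
patch = [[[r, c, r + c, r * c] for c in range(9)] for r in range(9)]

tokens_1 = split_into_blocks(patch, 1)   # N = 81 tokens of length 4
tokens_3 = split_into_blocks(patch, 3)   # N = 9 tokens of length 3 * 3 * 4 = 36
print(len(tokens_1), len(tokens_1[0]), len(tokens_3), len(tokens_3[0]))  # prints: 81 4 9 36
```

The block size trades sequence length against token width: 1 × 1 blocks give many short tokens, one per pixel, while 3 × 3 blocks give fewer tokens that each bundle a small neighbourhood.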
A coding method for the ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification comprises the following steps:
step S1: cropping data from the data set according to a window size, with the class of the central point taken as the classification label; preprocessing the data by principal component analysis (PCA); and inputting the processed data into the model framework;
step S2: processing the input data with the radial accumulation position coding method provided herein, accumulating the input data from the central point outward toward the periphery so that the data at each point carries position information;
step S3: dividing the input data into N block-shaped regions of size 1 × 1 or 3 × 3, flattening each of the N regions into a one-dimensional vector, and inputting the vectors into the ViT model;
step S4: inputting the data into the ViT model to obtain the classification result.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.
Claims (5)
1. A ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification, characterized by comprising:
a data extraction and preprocessing module, which crops data from the data set, preprocesses it by principal component analysis (PCA), and inputs the processed data into the model framework;
a position coding module, which accumulates the input data from the central point outward toward the periphery so that the data at each point carries position information;
a data splitting module, which divides the data into a plurality of block-shaped regions, flattens each region into a one-dimensional vector, and inputs the vectors into the ViT model;
and a main model module, which comprises a ViT model and processes the input data to output a classification result.
2. The ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification according to claim 1, wherein the position coding module, by accumulating the data radially from the central point toward the periphery, distinguishes the positions of the individual points in a manner similar to conventional position coding, and gives the model absolute rotation invariance when the block-shaped regions are of size 1 × 1.
3. The ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification according to claim 1, wherein the dimensionality of the data is not changed when the input data is preprocessed by principal component analysis (PCA).
4. The ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification according to claim 1, wherein the data splitting module divides the input data into N block-shaped regions of size 1 × 1 or 3 × 3, flattens each of the N regions into a one-dimensional vector, and inputs the vectors into the ViT model.
5. A coding method for a ViT-based radial accumulation position coding system for hyperspectral remote sensing image classification, characterized by comprising the following steps:
step S1: cropping data from the data set according to a window size, with the class of the central point taken as the classification label; preprocessing the data by principal component analysis (PCA); and inputting the processed data into the model framework;
step S2: processing the input data with the radial accumulation position coding method provided herein, accumulating the input data from the central point outward toward the periphery so that the data at each point carries position information;
step S3: dividing the input data into N block-shaped regions of size 1 × 1 or 3 × 3, flattening each of the N regions into a one-dimensional vector, and inputting the vectors into the ViT model;
step S4: inputting the data into the ViT model to obtain the classification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111453939.XA CN114037899A (en) | 2021-12-01 | 2021-12-01 | VIT-based hyperspectral remote sensing image-oriented classification radial accumulation position coding system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114037899A (en) | 2022-02-11 |
Family
ID=80139551
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111453939.XA Pending CN114037899A (en) | VIT-based hyperspectral remote sensing image-oriented classification radial accumulation position coding system | 2021-12-01 | 2021-12-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114037899A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171122A (en) * | 2017-12-11 | 2018-06-15 | 南京理工大学 | The sorting technique of high-spectrum remote sensing based on full convolutional network |
CN110992245A (en) * | 2019-11-18 | 2020-04-10 | 沈阳航空航天大学 | Hyperspectral image dimension reduction method and device |
CN113344070A (en) * | 2021-06-01 | 2021-09-03 | 南京林业大学 | Remote sensing image classification system and method based on multi-head self-attention module |
CN113688813A (en) * | 2021-10-27 | 2021-11-23 | 长沙理工大学 | Multi-scale feature fusion remote sensing image segmentation method, device, equipment and storage |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114842253A (en) * | 2022-05-04 | 2022-08-02 | 哈尔滨理工大学 | Hyperspectral image classification method based on self-adaptive spectrum space kernel combination ViT |
CN114998653A (en) * | 2022-05-24 | 2022-09-02 | 电子科技大学 | ViT network-based small sample remote sensing image classification method, medium and equipment |
CN114998653B (en) * | 2022-05-24 | 2024-04-26 | 电子科技大学 | ViT network-based small sample remote sensing image classification method, medium and equipment |
CN115661688A (en) * | 2022-10-09 | 2023-01-31 | 武汉大学 | Unmanned aerial vehicle target re-identification method, system and equipment with rotation invariance |
CN115661688B (en) * | 2022-10-09 | 2024-04-26 | 武汉大学 | Unmanned aerial vehicle target re-identification method, system and equipment with rotation invariance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||