CN117372767A - Hyperspectral image tree classification method, device and storage medium - Google Patents
- Publication number
- CN117372767A (application CN202311366203.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- features
- hyperspectral image
- attention
- hyperspectral
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/54—Extraction of image or video features relating to texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/58—Extraction of image or video features relating to hyperspectral data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/188—Vegetation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/194—Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention discloses a hyperspectral image tree species classification method, device and storage medium, comprising the following steps: after multi-scale segmentation of the hyperspectral image, inputting it into a pre-trained CNN model for flattening to obtain a flattened hyperspectral image; extracting texture features and spectral features from the hyperspectral image; inputting the texture features and spectral features into a double-head attention network to obtain image features with attention, which serve as the input of a Transformer model, with position codes added to the input part to obtain image features with relative position information. The encoded features are then sent to a decoder to obtain the classification result. The distribution of various tree species can thus be obtained, tree species classification is realized, and monitoring and supervision are facilitated. Meanwhile, because the spectral information of different plants is highly similar and very difficult to distinguish with the naked eye, establishing automatic tree species identification is particularly important.
Description
Technical Field
The invention discloses a hyperspectral image tree classification method, a hyperspectral image tree classification device and a storage medium, and relates to the field of automatic control of hydraulic engineering construction.
Background
Fine classification of forest tree species based on hyperspectral remote sensing data mainly relies on spectral matching classification algorithms, spectral features and statistical analysis methods. The three most widely applied directions for fine identification of forest tree species from hyperspectral remote sensing are early traditional classification methods, multi-source remote sensing data collaboration methods, and deep learning-based methods.
However, conventional machine learning classification methods have significant limitations on hyperspectral images. Tree species classification with machine learning requires a data dimension-reduction step, yet a hyperspectral image contains a large number of continuous narrow bands that cannot be fully exploited after dimension reduction. Moreover, machine learning has reached a ceiling with current algorithms and classification accuracy, and is difficult to optimize from other angles;
in multi-source remote sensing data collaboration methods, LiDAR data are not easy to acquire: airborne data cover only small areas and cannot be collected at large scale, while spaceborne data generally use a large-footprint mode, making it difficult to meet the application requirements of fine forest tree species research;
deep learning-based methods require a large number of training samples and high-quality labelled data, which limits their practical application. Meanwhile, the large number of parameters may cause the model to overfit during training, resulting in insufficient generalization capability. Moreover, when extracting joint spatial-spectral features, the size of the spatial neighborhood, the size, structure and complexity of the network, and the size of the input data space all need to be considered, and these have a great influence on the classification capability of the model.
Disclosure of Invention
Aiming at the defects in the background art, the invention provides a hyperspectral image tree species classification method, device and storage medium; the scheme of combining multi-scale segmentation with a Transformer model obtains the distribution of various tree species and realizes tree species classification.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows: a hyperspectral image tree classification method comprises the following steps:
acquiring a hyperspectral image;
denoising, filtering and dimension reduction preprocessing are carried out on the hyperspectral image;
carrying out multi-scale segmentation on the preprocessed hyperspectral image;
inputting the segmented hyperspectral image into a pre-trained CNN model for flattening, and obtaining a flattened hyperspectral image;
extracting the flattened hyperspectral image through a gray level co-occurrence matrix to obtain texture features;
performing independent principal component analysis on the flattened hyperspectral image to obtain spectral features of different bands, and selecting the first m bands carrying the most spectral information as the extracted spectral features;
inputting the extracted texture features and spectrum features into a double-head attention network model to obtain output image features with double-head attention;
taking the image features with double-head attention as input of a Transformer model, and adding a position code to obtain the image features with relative position information;
and encoding the image features with the relative position information, decoding the encoded image features and the output of class embedding, up-sampling, classifying each pixel, and outputting a hyperspectral image tree classification result.
Further, the multi-scale segmentation of the preprocessed hyperspectral image specifically includes: judging the size of the change rate value of the local change of the homogeneity of the preprocessed hyperspectral image under different segmentation scale parameters, and segmenting the hyperspectral image according to the corresponding scale value when the local change rate value is the maximum.
Further, inputting the segmented hyperspectral image into a pretrained CNN model for flattening, and obtaining a flattened hyperspectral image:
the pre-trained CNN model includes: firstly, convolving the hyperspectral data with a 3D kernel for preliminary pre-training; then performing structure extraction and comparison on the true and false hyperspectral images of each input band through 3 convolution layers; and finally outputting, through a Leaky ReLU function, the probability that the input image is a true hyperspectral image;
the flattening method specifically comprises the following steps: the three-dimensional layer in the CNN model is converted to a one-dimensional vector using a "flattening layer".
Further, inputting the extracted texture features and spectral features into a double-head attention network model to obtain output image features with double-head attention specifically comprises the following steps: the texture features and spectral features are input into the double-head attention network, and three attention features are obtained through three 1×1 convolution layers with different weights; the first attention feature is transposed and multiplied by the second attention feature, and the result is input into a Softmax function to obtain an attention map; the obtained attention map is transposed and multiplied by the third attention feature matrix, and then passed through a 1×1 convolution layer to obtain the image features with double-head attention.
Further, the double-headed attention network model β employs the following formula:

β_i = exp(s_i) / Σ_{j=1}^{N} exp(s_j)

where N is the number of image features and s_i is the attention scoring function:

s_i = (W_f x)^T · (W_g x)

where x is the image attention feature extracted by the convolutional network, W_f and W_g are two weight matrices implemented by 1×1 convolution, and T denotes matrix transposition;

the image feature x_0 with double-head attention is:

x_0 = W_v ( Σ_i β_i h(x_i) ) + x_i

where h(x_i) = W_h x_i; W_h and W_v are two weight matrices; x_i is the input image feature, and β is the attention network model function.
Further, taking the image features with double-head attention as the input of a Transformer model and adding position codes to obtain the image features with relative position information specifically comprises:

x_0′ = x_0 + E_pos

where x_0′ is the image feature with relative position information and E_pos is the defined position vector.
Further, encoding the image features with relative position information, decoding the encoded image features together with the class-embedding output, up-sampling, classifying each pixel and outputting the hyperspectral image tree species classification result specifically employs:

Attention(Q, K, V) = SoftMax(Q·K^T / √d_k + B)·V

where B is the bias matrix indexed by each pixel-pair position in the relative position index matrix; SoftMax normalizes each row vector, i.e. after division by √d_k; d_k is the feature dimension; Q denotes the query, K the key and V the value, obtained through the weight matrices W_Q, W_K and W_V respectively. The final output is obtained by taking the dot product of the query and the key, normalizing it to obtain the weight of each value, and then multiplying and summing the weights with the values.
A hyperspectral image tree classification device comprises a processor and a storage medium; the storage medium is used for storing instructions; the processor is used for operating according to the instruction to execute the steps of the hyperspectral image tree classification method.
A storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the hyperspectral image tree species classification method described above.
The beneficial effects are as follows: the scheme of combining multi-scale segmentation with a Transformer model obtains the distribution of various tree species, realizes tree species classification, and facilitates monitoring and supervision. Meanwhile, because the spectral information of different plants is highly similar and very difficult to distinguish with the naked eye, establishing automatic tree species identification is particularly important.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The implementation of the technical solution is described in further detail below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
As shown in fig. 1, a hyperspectral image tree classification method includes:
Step S1, acquiring a hyperspectral image;
Step S2, denoising, filtering and dimension-reduction preprocessing are carried out on the hyperspectral image;
Step S3, carrying out multi-scale segmentation on the preprocessed hyperspectral image;
Step S4, inputting the segmented hyperspectral image into a pre-trained CNN model for flattening, and obtaining a flattened hyperspectral image;
Step S5, extracting texture features from the flattened hyperspectral image through a grey-level co-occurrence matrix;
Step S6, performing independent principal component analysis on the flattened hyperspectral image to obtain spectral features of different bands, and selecting the first m bands carrying the most spectral information as the extracted spectral features;
step S7, inputting the extracted texture features and spectrum features into a double-head attention network model to obtain output image features with double-head attention and fusing feature layers;
Step S8, taking the image features with attention as input of a Transformer model, and adding position codes to the input part to obtain the image features with relative position information;
Step S9, encoding the image features with relative position information by using an encoder, decoding the encoder output together with the class embedding by using a Mask Transformer, up-sampling, classifying each pixel, and outputting the hyperspectral image tree species classification result.
Step S2, performing denoising, filtering and dimension reduction preprocessing on the hyperspectral image, including:
Hyperspectral image denoising is the process of separating sparse noise S and Gaussian noise N from the three-dimensional tensor Y to obtain a clean image X; the approach adopted in this embodiment incorporates prior information about the unknown hyperspectral image and the mixed noise; within this framework, the most widely used denoising method is Robust Principal Component Analysis (RPCA), which can be written as the following optimization problem:

min_{X,S} ‖X‖_* + λ‖S‖_1   s.t.   ‖Y − X − S‖_F ≤ δ

where ‖·‖_* is the nuclear norm (favouring a low-rank clean image X), ‖·‖_1 promotes sparsity of S, and δ bounds the Gaussian noise energy.
the hyperspectral data of the unmanned aerial vehicle is filtered and smoothed by using a Savitzky Golay method, and the filtering method is a filtering method based on time domain local polynomial least squares fitting, and can filter noise and simultaneously keep the shape and the width of the signal.
Principal Component Analysis (PCA) is the most basic dimension reduction method of hyperspectral data, and plays an important role in hyperspectral data compression, decorrelation, denoising and feature extraction.
In the principal component analysis transformation of hyperspectral remote sensing data, each band is usually regarded as a vector. Assume the hyperspectral remote sensing data has p bands and the spatial dimension of the image is m × n. The specific processing flow is as follows:
Image vectorization: the input image data is expressed as X = (x_1, x_2, …, x_p)^T, where each x_i is an N × 1 column vector with N = m × n, obtained by expanding the image by rows or columns and concatenating it into a vector.
Vector centering: the mean vector of the vector set is subtracted from all vectors in the set, i.e. Y = X − E(X).
The covariance matrix Σ of the vector group Y is calculated.
The eigenvalue matrix Λ and eigenvector matrix A of the covariance matrix Σ are solved.
The principal component transform is performed: Z = A^T·Y.
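The five steps above can be sketched in NumPy as follows (a minimal illustration; the cube size and number of retained components are made-up values):

```python
import numpy as np

def pca_transform(cube, n_components):
    """Principal component transform of an (m, n, p) hyperspectral cube:
    vectorise, centre, eigendecompose the band covariance, then project."""
    m, n, p = cube.shape
    X = cube.reshape(-1, p).astype(float)   # image vectorisation: N x p, N = m*n
    Y = X - X.mean(axis=0)                  # vector centring: Y = X - E(X)
    cov = np.cov(Y, rowvar=False)           # p x p covariance matrix
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues / eigenvectors
    order = np.argsort(vals)[::-1]          # sort by descending variance
    A = vecs[:, order[:n_components]]       # leading eigenvectors
    Z = Y @ A                               # principal component scores
    return Z.reshape(m, n, n_components)
```

The first output band then carries the largest variance, which is why the leading components are the natural candidates for the spectral features selected later.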
Step S3, carrying out multi-scale segmentation on the preprocessed hyperspectral image, wherein the step comprises the following steps:
considering that hyperspectral images are often affected by noise and weak edges, isolated regions can be easily generated using conventional region segmentation algorithms; therefore, the present embodiment adopts an improved algorithm, a multi-scale segmentation method; in this algorithm, a segmentation threshold needs to be determined, which will have a significant impact on the segmentation result; therefore, in this embodiment, the transformation matrix of the image is obtained through the adjacent differential transformation, and the segmentation threshold is obtained through the statistical information of the transformation matrix; the specific steps of the segmentation are as follows:
(1) Applying a gaussian filter to the image I to obtain a smoothed image I' =i×k, where K is a gaussian kernel function;
(2) The maximum gradient transformation is performed on the images I and I′ according to the following formulas, yielding the maximum gradient transformation matrices MGT(I) and MGT(I′):

MGT(I) = max(|MGT_i(I)|), i = 1, 2, 3, 4

MGT(I′) = max(|MGT_i(I′)|), i = 1, 2, 3, 4

(3) Segmentation thresholds λ and λ′ are obtained from the statistical information of MGT(I) and MGT(I′);
(4) The original image I and the smooth image I 'are respectively subjected to region growing segmentation, and the results are respectively stored in RG and RG';
(5) Integrating the segmentation results RG and RG' to obtain a final segmentation result Map;
the matrix Map is the result of dividing the image I, in which the black part is the target area (gray value is filled with 0) and the white part is the background area (gray value is filled with 1).
The purpose of step (1) is to achieve a multi-scale representation of the image: the original image I and the smoothed image I′ are representations of the image at different scales. I′ has eliminated some of the noise in I but has also weakened the edges of the original image, so its segmentation result RG′ effectively suppresses the influence of noise but performs poorly at region edges; the segmentation result RG of the original hyperspectral image is strongly affected by noise and segments poorly overall, but performs well at region edges. Thus, combining the two in step (5) combines their advantages.
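A sketch of the maximum gradient transform of step (2), under the assumption that MGT_i, i = 1..4 are the absolute differences along the horizontal, vertical and two diagonal neighbour directions (the text does not spell out the directional operators, so this is an illustrative reading):

```python
import numpy as np

def max_gradient_transform(img):
    """MGT(I) = max_i |MGT_i(I)|, i = 1..4, taking the four directions as
    horizontal, vertical and the two diagonals (edge-replicated border)."""
    img = np.asarray(img, float)
    h, w = img.shape
    pad = np.pad(img, 1, mode="edge")
    centre = pad[1:h + 1, 1:w + 1]
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]  # the four neighbour directions
    grads = [np.abs(centre - pad[1 + dy:h + 1 + dy, 1 + dx:w + 1 + dx])
             for dy, dx in shifts]
    return np.max(grads, axis=0)
```

The thresholds λ and λ′ of step (3) could then be taken from statistics of this matrix, e.g. its mean plus a multiple of its standard deviation; the exact statistic is an assumption here, since the text only says "statistical information".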
S4, inputting the segmented hyperspectral image into a pre-trained CNN model for flattening, and obtaining a flattened hyperspectral image;
A "flattening layer" is used to convert the three-dimensional layers in the network into one-dimensional vectors to suit the input of the fully connected layers used for classification; for example, a 5×5×2 tensor is converted into a vector of size 50. The preceding convolution layers extract features from the input hyperspectral image; these features must now be classified. Classifying them with a Transformer model requires one-dimensional input, which is why the flattening layer is required.
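The 5×5×2 example above amounts to a single reshape, e.g. in NumPy:

```python
import numpy as np

# Output of the last convolution layer: a 5 x 5 x 2 feature tensor
feature_map = np.arange(50.0).reshape(5, 5, 2)

# "Flattening layer": the 3-D tensor becomes a 1-D vector of size 50,
# suitable as one-dimensional input for the subsequent classifier.
flat = feature_map.reshape(-1)
```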
In some embodiments, step S5, extracting texture features from the flattened hyperspectral image through a gray level co-occurrence matrix;
in a specific implementation, since the texture of vegetation in a hyperspectral image generally has no obvious directionality, the texture features are extracted as the average over the 4 directions 0°, 45°, 90° and 135°, with sliding window sizes of 3×3, 5×5, …, 31×31 and a step length of 1.
In the embodiment, a plurality of wave bands with high definition, low interference information and obvious ground characteristic information are selected for texture analysis, so that the difference of various ground characteristic image characteristics can be fully displayed; eight texture features are extracted, including mean, variance, homogeneity, contrast, difference, entropy, second moment and correlation; the appropriate texture window size is selected to achieve the highest overall classification accuracy of the tree species.
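A minimal NumPy sketch of the grey-level co-occurrence computation and the four-direction averaging described above (simplified: fixed grey-level count, whole-image window, and only three of the eight listed features):

```python
import numpy as np

def glcm(gray, dy, dx, levels):
    """Normalised co-occurrence matrix for one displacement (dy, dx)."""
    h, w = gray.shape
    ys = slice(max(0, -dy), min(h, h - dy))
    xs = slice(max(0, -dx), min(w, w - dx))
    a = gray[ys, xs]                                    # reference pixels
    b = gray[ys.start + dy:ys.stop + dy, xs.start + dx:xs.stop + dx]
    P = np.zeros((levels, levels))
    np.add.at(P, (a.ravel(), b.ravel()), 1.0)           # count pairs
    return P / P.sum()

def texture_features(gray, levels):
    """Mean contrast, homogeneity and entropy over 0/45/90/135 degrees."""
    dirs = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]         # the four directions
    i, j = np.indices((levels, levels))
    feats = []
    for dy, dx in dirs:
        P = glcm(gray, dy, dx, levels)
        contrast = np.sum(P * (i - j) ** 2)
        homogeneity = np.sum(P / (1.0 + (i - j) ** 2))
        nz = P[P > 0]
        entropy = -np.sum(nz * np.log(nz))
        feats.append((contrast, homogeneity, entropy))
    return np.mean(feats, axis=0)
```

A perfectly uniform region yields zero contrast, unit homogeneity and zero entropy, which is the intuition behind using these statistics to separate smooth canopies from rough ones.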
S6, performing independent principal component analysis on the flattened hyperspectral image to obtain spectral features of different bands, and selecting the first m (m = 5) bands carrying the most spectral information as the extracted spectral features;
step S7, inputting the texture features and the spectrum features extracted in the step S5 and the step S6 into a double-head attention network model to obtain output image features with double-head attention and fusing feature layers; comprising the following steps:
texture features and spectral features are input into the double-head attention network, and three attention features are obtained through three 1×1 convolution layers with different weights; the first attention feature is transposed and multiplied by the second attention feature, and the result is input into a Softmax function to obtain an attention map; the obtained attention map is transposed and multiplied by the third attention feature matrix, and then passed through a 1×1 convolution layer to obtain the final image feature with attention.
The attention network model β adopts the following formula:

β_i = exp(s_i) / Σ_{j=1}^{N} exp(s_j)

where N is the number of image features, and the attention scoring function s_i is calculated as:

s_i = (W_f x)^T · (W_g x)

where x is the image attention feature extracted by the convolutional network, W_f and W_g are two weight matrices implemented by 1×1 convolution, and T denotes matrix transposition;

the image feature x_0 with attention is calculated as:

x_0 = W_v ( Σ_i β_i h(x_i) ) + x_i

where h(x_i) = W_h x_i; W_h and W_v are two weight matrices implemented by 1×1 convolution; x_i is the input image feature, and β is the attention network model function.
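The formulas above can be sketched as follows. This is a hedged illustration, not the patent's implementation: the channels-by-positions feature layout, the head dimensions and the fusion of the two heads by a final matrix (standing in for a 1×1 convolution) are all assumptions:

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_head(x, Wf, Wg, Wh, Wv):
    """One head: s = (Wf x)^T (Wg x); beta = softmax(s);
    output = Wv (h(x) beta^T) + x, with h(x) = Wh x."""
    s = (Wf @ x).T @ (Wg @ x)      # N x N attention scores
    beta = softmax(s, axis=-1)     # attention map, each row sums to 1
    h = Wh @ x                     # transformed features
    return Wv @ (h @ beta.T) + x   # attention-weighted sum + input feature

def dual_head_attention(x, heads, Wo):
    """Concatenate two heads and fuse with a 1x1-style matrix projection."""
    out = np.concatenate([attention_head(x, *w) for w in heads], axis=0)
    return Wo @ out
```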
Feature layer fusion uses the late-fusion approach: fusion is carried out on the prediction scores, i.e. several models are trained, each produces a prediction score, and the results of all models are fused to obtain the final prediction (detection results from different layers are combined to improve detection performance: detection starts on partially fused layers, multiple layers are detected, and the multiple detection results are fused at the end). In this line of research the features themselves are not fused; instead, predictions are made separately on the multi-scale features and the prediction results are then combined, as in the Single Shot MultiBox Detector (SSD) and the multi-scale CNN (MS-CNN).
Step S8, taking the image features with attention as input of the Transformer model and adding position codes to the input part to obtain the image features with relative position information:

x_0′ = x_0 + E_pos

where x_0′ is the image feature with relative position information and E_pos is the defined position vector.
After the position-coded information is added, the dimension remains equal to the final patch-embedding dimension.
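The addition x_0′ = x_0 + E_pos can be illustrated as below. The patent only defines E_pos as "a position vector" of the same dimension as the patch embedding; the fixed sinusoidal encoding used here is one common choice and is an assumption (the token count and even embedding dimension are also made up):

```python
import numpy as np

def add_position_encoding(tokens):
    """x0' = x0 + E_pos for an (n_tokens, d) embedding matrix.
    E_pos is a fixed sinusoidal code; d is assumed even for the
    sin/cos split, and the output dimension is unchanged."""
    n, d = tokens.shape
    pos = np.arange(n)[:, None]
    div = np.exp(-np.log(10000.0) * np.arange(0, d, 2) / d)
    E = np.zeros((n, d))
    E[:, 0::2] = np.sin(pos * div)   # even dimensions
    E[:, 1::2] = np.cos(pos * div)   # odd dimensions
    return tokens + E
```

Because every position receives a distinct vector, tokens at different pixels become distinguishable to the subsequent attention layers even when their content features are identical.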
Step S9, encoding the image features with relative position information using an encoder, decoding the encoder output together with the class embedding using a Mask Transformer, up-sampling, classifying each pixel, and outputting the hyperspectral image tree species classification result:

Attention(Q, K, V) = SoftMax(Q·K^T / √d_k + B)·V

where B is the bias matrix indexed by each pixel-pair position in the relative position index matrix; SoftMax normalizes each row vector, i.e. after division by √d_k; d_k is the feature dimension; Q denotes the query, K the key and V the value, obtained through the weight matrices W_Q, W_K and W_V respectively. The final output is obtained by taking the dot product of the query and the key, normalizing it to obtain the weight of each value, and then multiplying and summing the weights with the values.
A hyperspectral image tree classification device comprises a processor and a storage medium; the storage medium is used for storing instructions; the processor is used for operating according to the instruction to execute the steps of the hyperspectral image tree classification method.
A storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the hyperspectral image tree species classification method described above.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.
Claims (9)
1. A hyperspectral image tree classification method, characterized by comprising the following steps:
acquiring a hyperspectral image;
performing denoising, filtering and dimension-reduction preprocessing on the hyperspectral image;
carrying out multi-scale segmentation on the preprocessed hyperspectral image;
inputting the segmented hyperspectral image into a pre-trained CNN model for flattening, and obtaining a flattened hyperspectral image;
extracting texture features from the flattened hyperspectral image through a gray-level co-occurrence matrix;
performing independent principal component analysis on the flattened hyperspectral image to obtain spectral features of different wavebands, and selecting the spectral features of the first m wavebands carrying the most spectral information as the extracted spectral features;
inputting the extracted texture features and spectrum features into a double-head attention network model to obtain output image features with double-head attention;
taking the image features with double-head attention as the input of a Transformer model, and adding a position code to obtain the image features with relative position information;
and encoding the image features with the relative position information, decoding the encoded image features and the output of class embedding, up-sampling, classifying each pixel, and outputting a hyperspectral image tree classification result.
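The texture-feature step of claim 1 (gray-level co-occurrence matrix) can be sketched in plain NumPy; the quantisation levels, the single pixel offset, and the chosen Haralick statistics are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def glcm_features(band, levels=8, offset=(0, 1)):
    """Gray-level co-occurrence matrix features for one flattened band.

    band: 2-D array of integer gray levels in [0, levels).
    Returns contrast, energy and homogeneity computed from the
    normalised co-occurrence matrix for the given pixel offset.
    """
    dy, dx = offset
    glcm = np.zeros((levels, levels), dtype=float)
    h, w = band.shape
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[band[y, x], band[y + dy, x + dx]] += 1
    glcm /= glcm.sum()                                   # joint probability matrix
    i, j = np.indices((levels, levels))
    contrast    = np.sum((i - j) ** 2 * glcm)            # local intensity variation
    energy      = np.sum(glcm ** 2)                      # texture uniformity
    homogeneity = np.sum(glcm / (1.0 + np.abs(i - j)))   # closeness to diagonal
    return contrast, energy, homogeneity

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
feats = glcm_features(img, levels=4)
```

A perfectly uniform band gives zero contrast and unit energy/homogeneity, which is a quick sanity check on the statistics.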
2. The hyperspectral image tree classification method according to claim 1, wherein the multi-scale segmentation of the preprocessed hyperspectral image specifically comprises: evaluating the rate of change of the local homogeneity of the preprocessed hyperspectral image under different segmentation scale parameters, and segmenting the hyperspectral image with the scale value at which the local rate of change is largest.
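The scale-selection criterion of claim 2 resembles the rate-of-change-of-local-variance (ROC-LV) approach to estimating an optimal segmentation scale; a minimal sketch under that assumption (the function name and the variance values are illustrative, not from the patent):

```python
import numpy as np

def best_scale(scales, local_variance):
    """Pick the segmentation scale where the rate of change of local
    variance peaks, i.e. where homogeneity changes most abruptly."""
    lv = np.asarray(local_variance, dtype=float)
    # Percent change of local variance between consecutive scale parameters.
    roc = (lv[1:] - lv[:-1]) / lv[:-1] * 100.0
    # The scale reached at the largest jump is taken as the segmentation scale.
    return scales[int(np.argmax(roc)) + 1], roc

scales = [10, 20, 30, 40, 50]
lv     = [1.0, 1.2, 2.0, 2.1, 2.15]   # illustrative local-variance values
scale, roc = best_scale(scales, lv)
```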
3. The hyperspectral image tree classification method according to claim 1, wherein the segmented hyperspectral image is input into a pretrained CNN model for flattening, and the flattened hyperspectral image is obtained:
the pre-trained CNN model includes: firstly, convoluting hyperspectral data with a 3D kernel, performing preliminary pre-training, performing structure extraction and comparison on true and false hyperspectral images of each input wave band through 3-layer convolution, and finally judging whether the input image is the probability of the true hyperspectral image or not through a leakage ReLU function;
the flattening method specifically comprises the following steps: the three-dimensional layer in the CNN model is converted to a one-dimensional vector using a "flattening layer".
4. The hyperspectral image tree classification method according to claim 1, wherein inputting the extracted texture features and spectral features into a double-head attention network model to obtain the output image features with double-head attention specifically comprises: inputting the texture features and the spectral features into the double-head attention network, and obtaining three attention features through 1×1 convolution layers with three different weights; transposing the first attention feature, multiplying it by the second attention feature, and inputting the result into a Softmax function to obtain an attention map; transposing the obtained attention map, multiplying it by the third attention feature matrix, and passing the result through a 1×1 convolution layer to obtain the image features with double-head attention.
5. The hyperspectral image tree classification method according to claim 4, wherein the double-head attention network model β uses the following formula:

β_i = exp(s_i) / Σ_{i=1}^{N} exp(s_i)

where N is the number of image features, and s_i is the attention scoring function;
s_i = (W_f x)^T (W_g x)

wherein x is the image attention feature extracted by the convolutional network, W_f and W_g are two weight matrices, implemented by 1×1 convolution; T denotes matrix transposition;
image feature x with double-headed attention 0 Is as follows:
wherein h (x i )=W h x i ;W h And W is v Is two weight matrices; x is x i Beta is the attention network model function for the input image features.
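The formulas of claim 5 (scores s = (W_f x)^T(W_g x), softmax-normalised β, then the weighted sum W_v Σ β_i h(x_i)) can be sketched in NumPy; the matrix shapes, and treating the 1×1 convolutions as plain matrix multiplications, are assumptions for illustration:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def double_head_attention(x, w_f, w_g, w_h, w_v):
    """Attention in the style of claim 5: s = (W_f x)^T (W_g x),
    beta = softmax over the N feature positions, output = W_v (h(x) @ beta)."""
    f, g, h = w_f @ x, w_g @ x, w_h @ x     # each (d, N); 1x1 convs as matmuls
    s = f.T @ g                             # (N, N) attention scores
    beta = softmax(s, axis=0)               # normalise over the N input positions
    return w_v @ (h @ beta)                 # (d, N) attended image features

rng = np.random.default_rng(2)
d, n = 4, 6                                 # channels, number of feature positions
x = rng.normal(size=(d, n))
out = double_head_attention(x, *(rng.normal(size=(d, d)) for _ in range(4)))
```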
6. The hyperspectral image tree classification method according to claim 5, wherein taking the image features with double-head attention as the input of a Transformer model and adding position codes to obtain the image features with relative position information comprises the following steps:

x_0' = x_0 + E_pos

wherein: x_0' is the image feature with relative position information, and E_pos is the learnable position vector.
7. The hyperspectral image tree classification method according to claim 1, wherein encoding the image features with relative position information, decoding the encoded image features together with the output of the class embedding, classifying each pixel after up-sampling, and outputting the hyperspectral image tree classification result specifically comprises the following steps:

Attention(Q, K, V) = SoftMax(QK^T / √d_k + B) V

wherein: B is the relative-position bias indexed, for each pixel position, from the relative position index matrix; SoftMax normalises each row vector after division by √d_k; d_k is the feature dimension; Q denotes the query, K the key and V the value; W_Q, W_K, W_V are the fully connected mappings corresponding to Q, K and V respectively; the final output is obtained by taking the dot product of the query and the key, normalising it to obtain the weight of each value, and multiplying the weights with the values and summing.
8. The hyperspectral image tree classification device is characterized by comprising a processor and a storage medium; the storage medium is used for storing instructions; the processor being operative according to the instructions to perform the steps of the method according to any one of claims 1 to 7.
9. A storage medium having stored thereon a computer program, which when executed by a processor performs the steps of the method according to any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311366203.8A CN117372767A (en) | 2023-10-20 | 2023-10-20 | Hyperspectral image tree classification method, device and storage medium |
Publications (1)
Publication Number | Publication Date
---|---
CN117372767A | 2024-01-09
Family
ID=89407317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311366203.8A Pending CN117372767A (en) | 2023-10-20 | 2023-10-20 | Hyperspectral image tree classification method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117372767A (en) |
2023-10-20: Application CN202311366203.8A filed in China (CN117372767A); status: Pending.
Similar Documents
Publication | Title
---|---
CN110321963B (en) | Hyperspectral image classification method based on fusion of multi-scale and multi-dimensional space spectrum features
CN110287869B (en) | High-resolution remote sensing image crop classification method based on deep learning
CN111259828B (en) | High-resolution remote sensing image multi-feature-based identification method
CN111914611B (en) | Urban green space high-resolution remote sensing monitoring method and system
CN111310666B (en) | High-resolution image ground feature identification and segmentation method based on texture features
CN110008948B (en) | Hyperspectral image target detection method based on variational self-coding network
CN107145836B (en) | Hyperspectral image classification method based on stacked boundary identification self-encoder
CN112200090B (en) | Hyperspectral image classification method based on cross-grouping space-spectral feature enhancement network
CN112308152B (en) | Hyperspectral image ground object classification method based on spectrum segmentation and homogeneous region detection
Ou et al. | A CNN framework with slow-fast band selection and feature fusion grouping for hyperspectral image change detection
CN112101271A (en) | Hyperspectral remote sensing image classification method and device
CN108229551B (en) | Hyperspectral remote sensing image classification method based on compact dictionary sparse representation
CN112733800B (en) | Remote sensing image road information extraction method and device based on convolutional neural network
CN112949416B (en) | Supervised hyperspectral multiscale graph volume integral classification method
CN112434571A (en) | Hyperspectral anomaly detection method based on attention self-coding network
CN114398948A (en) | Multispectral image change detection method based on space-spectrum combined attention network
CN112766223A (en) | Hyperspectral image target detection method based on sample mining and background reconstruction
CN115471675A (en) | Disguised object detection method based on frequency domain enhancement
CN117058558A (en) | Remote sensing image scene classification method based on evidence fusion multilayer depth convolution network
Long et al. | Dual self-attention Swin transformer for hyperspectral image super-resolution
CN109947960B (en) | Face multi-attribute joint estimation model construction method based on depth convolution
CN112446256A (en) | Vegetation type identification method based on deep ISA data fusion
CN112381144B (en) | Heterogeneous deep network method for non-European and Euclidean domain space spectrum feature learning
CN115984714B (en) | Cloud detection method based on dual-branch network model
CN116246171A (en) | Target detection method and device for air-spectrum multi-scale hyperspectral remote sensing image
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination