CN116309348A - Lunar south pole impact pit detection method based on improved TransUnet network - Google Patents
- Publication number
- CN116309348A CN116309348A CN202310113929.4A CN202310113929A CN116309348A CN 116309348 A CN116309348 A CN 116309348A CN 202310113929 A CN202310113929 A CN 202310113929A CN 116309348 A CN116309348 A CN 116309348A
- Authority
- CN
- China
- Prior art keywords
- image
- transunet
- network
- model
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000001514 detection method Methods 0.000 title claims description 31
- 238000012549 training Methods 0.000 claims abstract description 23
- 239000011159 matrix material Substances 0.000 claims description 30
- 239000013598 vector Substances 0.000 claims description 16
- 230000006870 function Effects 0.000 claims description 12
- 238000005070 sampling Methods 0.000 claims description 12
- 238000000034 method Methods 0.000 claims description 11
- 238000012360 testing method Methods 0.000 claims description 11
- 230000011218 segmentation Effects 0.000 claims description 10
- 238000012795 verification Methods 0.000 claims description 8
- 230000008569 process Effects 0.000 claims description 6
- 230000002708 enhancing effect Effects 0.000 claims description 5
- 238000007781 pre-processing Methods 0.000 claims description 4
- 238000012545 processing Methods 0.000 claims description 4
- 238000012952 Resampling Methods 0.000 claims description 3
- 230000004913 activation Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 238000011176 pooling Methods 0.000 claims description 3
- 238000000605 extraction Methods 0.000 abstract description 7
- 238000005286 illumination Methods 0.000 abstract description 5
- 238000003709 image segmentation Methods 0.000 abstract description 2
- 238000003058 natural language processing Methods 0.000 abstract 1
- 238000003672 processing method Methods 0.000 abstract 1
- 238000013527 convolutional neural network Methods 0.000 description 6
- 230000000694 effects Effects 0.000 description 5
- 238000011160 research Methods 0.000 description 4
- 230000009191 jumping Effects 0.000 description 3
- 230000003287 optical effect Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000013136 deep learning model Methods 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000006978 adaptation Effects 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000013145 classification model Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000008034 disappearance Effects 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000004880 explosion Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 238000012876 topography Methods 0.000 description 1
- 238000013526 transfer learning Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Quality & Reliability (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
Aiming at the problems that impact craters are difficult to extract from lunar polar images by image-processing methods because of the region's illumination conditions, and that a Digital Elevation Model (DEM) cannot resolve small craters, the invention provides AM-TransUNet+, a small-crater extraction algorithm for future regions of interest at the lunar south pole, built on the Transformer model from natural language processing and the UNet+ convolutional image-segmentation network. The algorithm operates on Lunar Reconnaissance Orbiter Narrow Angle Camera images (LRO NAC, 0.5 m/pixel) and adds a Convolutional Block Attention Module and depthwise separable convolution to the core TransUNet model, which further speeds up model convergence and improves precision while greatly reducing the number of training parameters without loss of model accuracy. The accuracy of the algorithm is additionally verified on Mars and Mercury surface images, and the experimental results show that AM-TransUNet+ exhibits good transferability and accuracy across different deep-space remote sensing data.
Description
Technical Field
The invention relates to the technical field of remote sensing image processing, and in particular to a lunar south pole impact crater detection method based on an improved TransUNet network.
Background
The impact crater is the most prominent landform of the lunar surface; craters vary widely in type and shape, forming ring structures of different sizes and uneven density. Research on impact craters not only supports inferring the relative geological age and surface characteristics of the lunar surface and inverting the presence of water ice, but is also applied to spacecraft positioning and navigation, lunar base site selection, rover obstacle avoidance, and the like.
In recent years, many countries worldwide have established lunar exploration programs, and accurate, rapid crater identification has long been a research focus of the deep-space exploration field. Researchers have proposed a series of lunar Crater Detection Algorithms (CDA), whose development is shown in fig. 1. These methods can be broadly divided into two categories according to their data source: lunar surface images observed by optical sensors, and Digital Elevation Models (DEM) obtained by laser-altimeter scanning. One method constructed and trained a U-Net segmentation model with Keras and TensorFlow on DEM data from the Lunar Reconnaissance Orbiter (LRO) and the Kaguya probe, used skip connections to fuse low-level and high-level features, and identified approximately 92% of the craters in the combined Povilaitis (crater diameter 5-20 km) and Head (crater diameter >20 km) databases. Wang et al. developed a DEM-based CDA applied to the LOLA DEM to detect craters using 3D morphological features such as rim height, interior slope and depth, and constructed a crater dataset (LU1319373) of craters over 1 km in diameter with a detection rate of about 85%. Salih et al. studied crater degradation characteristics on the basis of YOLOv3 crater detection over six mid-latitude areas with varying illumination levels, using Lunar Reconnaissance Orbiter Narrow Angle Camera (LRO NAC) images. Yang et al. used Domain Adaptation (DA) on LRO NAC images to detect unlabeled real data samples efficiently, proposed a new network, CraterDANet, to extract craters in the Chang'e-4 landing zone, and created a new lunar crater dataset containing 20,000 craters.
Because images observed by optical sensors have high resolution, small craters can be detected in them. However, optical sensors are susceptible to illumination, which makes craters difficult to classify by image processing: under severe lighting conditions a crater may be invisible or only partially visible. Because a DEM provides elevation information, craters can be found in it by threshold filtering, but DEM resolution is lower, so small craters may not appear. In summary, although crater detection methods for various sizes have been disclosed in the prior art, the following problems remain for small craters on the order of meters to hundreds of meters in diameter:
1. Unlike the low and mid latitudes, the lunar south pole has strongly undulating topography, large and dynamically changing shadowed areas, and extremely uneven illumination; high-resolution images are therefore better suited than low-resolution DEMs for detecting meter-scale small craters. However, current research covers only experiments in low- and mid-latitude areas, and extracting small craters in lunar polar regions remains challenging.
2. Supervised approaches learn features from a labeled training dataset, which requires a large number of samples. Obtaining good detection results demands high-quality manually labeled training data covering diverse conditions such as terrain, lighting, and crater degradation.
3. Deep learning methods have achieved breakthroughs in the accuracy and reliability of multi-scale crater detection, especially for large craters and even at global scale, but do not perform as well on small craters. The reason is that the receptive fields of top-layer neurons are large, so the information of small-scale targets is not fully preserved.
Disclosure of Invention
Aiming at these problems, the invention provides a lunar south pole impact crater detection method based on an improved TransUNet network, in which a Convolutional Block Attention Module (CBAM), depthwise separable convolution (DSC) and an enhancement module are added on top of TransUNet. The model applies channel and spatial attention simultaneously, so the channel and spatial relationships of features can be explored more thoroughly, and crater contour features are captured more effectively, enabling better recognition of smaller and overlapping craters under the different illumination azimuths of the lunar south pole.
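The parameter saving from depthwise separable convolution can be illustrated with a back-of-the-envelope count; the channel sizes below are illustrative, not taken from the patent:

```python
def conv_params(c_in, c_out, k):
    # standard convolution: one k x k kernel per (input, output) channel pair
    return c_in * c_out * k * k

def dsc_params(c_in, c_out, k):
    # depthwise separable convolution: one k x k kernel per input channel,
    # followed by a 1 x 1 pointwise convolution that mixes channels
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)   # 64*128*9  = 73728
dsc = dsc_params(64, 128, 3)    # 64*9 + 64*128 = 8768
print(std, dsc, round(std / dsc, 1))
```

For a 3×3 layer with 64 input and 128 output channels, the separable form needs roughly 8× fewer weights, which is the mechanism behind the reduced encoder parameter count claimed for the model.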
The technical solution for realizing the purpose of the invention is as follows:
the moon south pole impact pit detection method based on the improved TransUnet network is characterized by comprising the following steps of:
step 1: acquiring a moon south pole LRO NAC image;
step 2: preprocessing an input LRO NAC image;
step 3: based on the preprocessed LRO NAC image, manually identifying real craters in ArcGIS software using the CraterTools plugin as label data, and dividing the obtained LRO NAC image sample data into a training set, a verification set and a test set;
step 4: constructing an AM-TransUnet+ network model, and training on a training set;
step 5: updating the weight matrix through back propagation, and repeating the step 4 until the AM-TransUnet+network precision reaches preset precision or loss function convergence on the verification set;
step 6: testing the trained AM-TransUnet+ network model on a test set;
step 7: inputting the moon south pole image to be detected into a trained AM-TransUnet+ network, and finally outputting a detection result.
Further, the specific operation steps of the step 2 include:
step 21: performing random sliding window sampling on LRO NAC image data, and randomly scaling the size of a sliding window;
step 22: resampling NAC window data obtained by random sampling to a fixed size;
step 23: performing flipping and mirroring on the resampled LRO NAC image.
Further, the AM-TransUnet+ network model constructed in the step 4 comprises an encoder, a decoder and an enhancement module, wherein the encoder is used for encoding the characteristics of the input image vector; the decoder is used for decoding the characteristics of the image vector; the enhancement module is used for enhancing the characteristics of the coded image vector.
Further, the encoder comprises a convolutional block attention module, a CNN convolution module and a Transformer layer. First, image block weights are generated from the column vectors of the score matrix, and the weight matrices obtained from the score matrices in the MSA of each Transformer layer are summed to obtain that layer's weight matrix; then each encoder feature is enhanced by the weight matrix in the enhancement module and the skip features are transmitted to the decoder; finally, the weight matrix is upsampled in the decoder, through the cascade upsampler, to the size corresponding to the skip features.
Further, the specific steps of detection by the AM-TransUNet+ network in step 7 include:
step 71: inputting a moon south pole image to be detected;
step 72: performing downsampling three times in the encoder to obtain the corresponding feature matrices, where each downsampling step comprises one convolution, regularization, ReLU activation and a max pooling layer;
step 73: after obtaining the feature map extracted by the CNN in the encoder, obtaining the spatial information of the image blocks by adding position embeddings to the image block embeddings; the output of the l-th encoder layer is expressed as:

r_l' = MSA(LS(r_{l-1})) + r_{l-1}

r_l = MLP(LS(r_l')) + r_l'

where MSA(·) is the multi-head self-attention operation, MLP(·) is the multi-layer perceptron operation, LS(·) denotes layer normalization, and r_l is the feature representation reconstructed by the l-th Transformer layer;
step 74: performing multi-stage upsampling and decoding in the decoder through the cascade upsampler, and outputting a segmentation result from the features;
step 75: and generating a final segmentation mask by fusing the characteristics transmitted by the encoder, and identifying the small impact pits in the lunar NAC image according to the segmentation mask.
Further, when the AM-TransUNet+ network model is trained in step 4, the initial learning rate is set to 0.001, the number of training iterations to 300, the number of filters to 112, the filter length to 3, and the dropout value to 0.15.
Further, the AM-TransUNet+ network model is trained using the Binary Cross Entropy (BCE) loss function:

loss = y_i - y_i t_i + log(1 + exp(-y_i))

where y_i is the predicted value of pixel i in the AM-TransUNet+ output and t_i is the label of this pixel in the ground truth.
The beneficial effects are that:
First, the invention provides a backbone network adapted to tiny lunar south pole impact craters, which extracts rich context information while preserving detail, and is very effective in improving detection performance.
Second, the invention strengthens the skip function through redesigned skip connections; combining the score-matrix column vectors with the skip connections makes it possible to extract the contour information of small impact craters in the lunar south pole region of interest. Furthermore, combining the CBAM module with the DSC module in the classical TransUNet improves performance while significantly reducing the parameter count of the model's encoder section.
Third, the model transfers very well, achieving good results in Mars and Mercury crater identification.
Drawings
FIG. 1 is a timeline of crater identification research at home and abroad;
FIG. 2 is a flow chart of the present invention;
FIG. 3 is an AM-TransUNet+ model architecture;
FIG. 4 shows the AM-TransUNet+ crater extraction results for lunar south pole regions of interest; fig. 4 (a) and 4 (b) are the input image and detection result for the area around Haworth; fig. 4 (c) and 4 (d) are the input image and detection result for the area near Amundsen; fig. 4 (e) and 4 (f) are the input image and detection result for the area near Shackleton; fig. 4 (g) and 4 (h) are the input image and detection result for the area around Faustini;
FIG. 5 is an AM-TransUNet+ network Epoch-Loss curve;
FIG. 6 shows AM-TransUNet+ detection results on different data; fig. 6 (a) and 6 (b) are the Mars input image and detection result; fig. 6 (c) and 6 (d) are the Mercury input image and detection result;
fig. 7 is a schematic diagram of the score matrix and the redesigned skip connection.
Detailed Description
In order to enable those skilled in the art to better understand the technical solution of the present invention, the technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
The invention provides an end-to-end lunar south pole crater extraction algorithm (AM-TransUNet+), whose main purpose is to detect multi-scale craters, especially small ones, at the lunar south pole. It comprises the following steps:
1. image preprocessing
Narrow Angle Camera (NAC) images from the Lunar Reconnaissance Orbiter (LRO) were acquired at a resolution of 0.5 m/pixel. High-Resolution Stereo Camera (HRSC) images from Mars Express and MESSENGER image data of Mercury (Mercury_MESSENGER_metallic_global_250m) serve as verification data.
Since the size of lunar NAC product data is fixed, while deep learning for image segmentation typically takes square image input, LRO NAC data are sampled with a random sliding window to suit the deep learning model, and the sliding window size is also scaled randomly to accommodate craters of different diameters. The randomly sampled NAC window data are then resampled to a fixed size, such as 144×144, to meet the input requirement of the deep learning model. To further enlarge the training data and improve recognition performance, data augmentation operations such as flipping and mirroring are used during preprocessing.
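The sampling and augmentation steps above can be sketched as follows; the window-size bounds and the random image standing in for a NAC strip are illustrative assumptions, not values from the patent:

```python
import numpy as np

def resample(img, size):
    # nearest-neighbor resampling to a fixed (size x size) grid
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def sample_window(img, rng, out_size=144, min_win=64, max_win=512):
    # random sliding-window crop with a randomly scaled window size
    win = int(rng.integers(min_win, max_win + 1))
    h, w = img.shape
    win = min(win, h, w)
    y = int(rng.integers(0, h - win + 1))
    x = int(rng.integers(0, w - win + 1))
    patch = resample(img[y:y + win, x:x + win], out_size)
    if rng.random() < 0.5:          # random vertical flip
        patch = patch[::-1, :]
    if rng.random() < 0.5:          # random horizontal mirror
        patch = patch[:, ::-1]
    return patch

rng = np.random.default_rng(0)
nac = rng.random((1024, 1024))       # stand-in for an LRO NAC image strip
batch = np.stack([sample_window(nac, rng) for _ in range(8)])
print(batch.shape)                   # (8, 144, 144)
```

Each call yields one fixed-size training patch; a real pipeline would iterate over actual NAC products and apply the same crop to the label mask.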
2. Making data labels and packets
The data labels are made from the preprocessed LRO NAC images: real craters are manually identified in ArcGIS software with the CraterTools plugin and used as label data. The verification and test data are generated in the same way, with the test images drawn from areas different from those of the training images; 80% of the dataset is used as the training set, 10% as the verification set and 10% as the test set.
3. Construction of AM-TransUnet+moon south pole impact pit detection model
AM-TransUNet+ consists of an encoder, a decoder and an enhancement module. AM-TransUNet+ replaces the encoder layer of the original TransUNet with attention-module convolution, generates a weight matrix from the column vectors of the score matrix to strengthen the features and raise the attention paid to key patches, and then upsamples to finally produce a segmentation result at the original image size. The score matrix and the redesigned skip connection, shown in fig. 7, work as follows:
first, image block weights are generated from column vectors and constructed into a matrix. And summing all weight matrixes obtained by the fractional matrix in the MSA of the encoder converter to obtain the weight matrix of the converter layer. Since each transducer layer has an independent weight matrix, there are N weight matrices, which are summed to obtain the final weight matrix.
Wherein w is patch Is the weight of each image block, M i Is the ith th The scoring matrix in layer transform, f (·) is an operation involving a column vector.
Second, the weight matrix is upsampled to the size corresponding to the skip features. Each encoder feature is then enhanced by the weight matrix, and the skip features are transmitted to the decoder:

F_out = F_in × ups(W_patch)

where F_in and F_out are the features before and after the enhancement module respectively, W_patch is the weight matrix, and ups(·) is the upsampling operation.
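A minimal sketch of the enhancement step F_out = F_in × ups(W_patch), assuming nearest-neighbor upsampling and illustrative tensor sizes (9×9 patch grid, 72×72 skip feature):

```python
import numpy as np

def ups_nearest(w, size):
    # nearest-neighbor upsampling of the patch-weight matrix to the
    # spatial size of the skipped encoder feature
    n = w.shape[0]
    idx = np.arange(size) * n // size
    return w[np.ix_(idx, idx)]

def enhance(f_in, w_patch):
    # F_out = F_in * ups(W_patch), broadcast over the channel axis
    h = f_in.shape[1]
    return f_in * ups_nearest(w_patch, h)[None, :, :]

f_in = np.ones((16, 72, 72))                        # (C, H, W) skip feature
w_patch = np.random.default_rng(1).random((9, 9))   # 9x9 patch weights
f_out = enhance(f_in, w_patch)
print(f_out.shape)
```

With F_in set to all ones, every channel of F_out is exactly the upsampled weight map, which makes the broadcasting behavior easy to check.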
The AM-TransUnet+ network specific flow comprises the following steps:
(1) The lunar south pole input image, of size 144×144, is downsampled three times, yielding feature matrices F1, F2 and F3 of sizes (x/2, y/2, c), (x/4, y/4, c) and (x/8, y/8, c) respectively, where c is the number of channels. Each downsampling step includes a convolution, regularization, ReLU activation, and a max pooling layer.
(2) In the encoder path, CBAM is incorporated into the CNN portion of the CNN-Transformer hybrid layer. After the CNN-extracted feature map is obtained, the features are mapped into a new embedding space by a trainable linear projection; adding position embeddings to the image block embeddings yields the image-block spatial information y_input:

y_input = [x_1 E; x_2 E; ...; x_n E] + E_loc

where y_input is the input of the Transformer layer, x_i is a feature image block extracted by the CNN, E is the linear projection, E_loc represents the position embedding, n is the number of blocks, and [ ; ... ; ] is a concatenation operation.
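The patch-embedding formula can be reproduced numerically; the dimensions below (81 patches, flattened patch size 256, embedding size 128) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_patch, d_model = 81, 256, 128    # patches, flattened patch dim, embed dim
patches = rng.random((n, d_patch))    # x_i: flattened CNN feature-map patches
E = rng.random((d_patch, d_model))    # trainable linear projection
E_loc = rng.random((n, d_model))      # learned position embedding

# y_input = [x_1 E; x_2 E; ...; x_n E] + E_loc
y_input = patches @ E + E_loc
print(y_input.shape)                  # (81, 128)
```

Stacking the projected patches row by row and adding E_loc is exactly the concatenation-plus-position-embedding operation written above.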
As shown in fig. 3 (left), each encoder Transformer module comprises layer normalization, a multi-head self-attention module (MSA) containing several self-attention heads, a multi-layer perceptron (MLP), and residual connections; in each layer of the Transformer network, the vector entering the self-attention mechanism or the feedforward network is also carried around it by a residual connection to reinforce the output vector of the self-attention mechanism or feedforward network. The output of the l-th encoder layer can be expressed as:

r_l' = MSA(LS(r_{l-1})) + r_{l-1}

r_l = MLP(LS(r_l')) + r_l'

where MSA(·) is the multi-head self-attention operation, MLP(·) is the multi-layer perceptron operation, LS(·) denotes layer normalization, and r_l is the feature representation reconstructed by the l-th Transformer layer.
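The two equations above can be sketched as a NumPy-only Transformer layer; for brevity a single attention head and random weights are used here, whereas the patent's MSA uses multiple heads:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # the "score matrix"
    return scores @ v

def transformer_layer(r, Wq, Wk, Wv, W1, W2):
    # r_l' = MSA(LS(r_{l-1})) + r_{l-1}
    r1 = self_attention(layer_norm(r), Wq, Wk, Wv) + r
    # r_l = MLP(LS(r_l')) + r_l'   (two-layer ReLU MLP)
    return np.maximum(layer_norm(r1) @ W1, 0) @ W2 + r1

rng = np.random.default_rng(0)
n, d = 81, 64
r = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
W1 = rng.standard_normal((d, 4 * d)) * 0.1
W2 = rng.standard_normal((4 * d, d)) * 0.1
out = transformer_layer(r, Wq, Wk, Wv, W1, W2)
print(out.shape)                      # (81, 64)
```

The intermediate `scores` array is the per-layer score matrix whose column vectors the enhancement module reuses.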
(3) In the decoder, a cascade upsampler performs multi-stage upsampling and decoding, and the segmentation result is output from the features. As shown in fig. 3 (right), after all the convolutions each decoder block in each layer contains a 2× upsampling, a feature concatenation and a convolution operator. After each decoder block, the length and width of the feature are doubled and the number of channels is halved; after all three decoder blocks, the length and width are half those of the original image. Finally, the result is obtained by a 1×1 convolution layer.
In AM-TransUNet+, the convolutional block attention module is a lightweight, general-purpose attention module for feedforward convolutional neural networks. Given an intermediate feature map, the module infers attention maps in turn, which are used to refine the features. The enhancement module uses the column vectors of the score matrix to strengthen the skip function, thereby redesigning the skip connection. The final segmentation mask is generated by fusing the features transmitted from the encoder, and small craters in the lunar NAC image are identified from this mask.
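As a sketch of CBAM's mechanism, the following implements only its channel-attention half (global average- and max-pooled descriptors through a shared two-layer MLP, then a sigmoid gate); the spatial-attention half is omitted and all sizes and weights are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(f, W1, W2):
    # CBAM channel attention: shared MLP applied to global average- and
    # max-pooled channel descriptors, combined and squashed into a gate
    avg = f.mean(axis=(1, 2))                         # (C,)
    mx = f.max(axis=(1, 2))                           # (C,)
    gate = sigmoid(np.maximum(avg @ W1, 0) @ W2 +
                   np.maximum(mx @ W1, 0) @ W2)       # (C,) in (0, 1)
    return f * gate[:, None, None]                    # rescale each channel

rng = np.random.default_rng(0)
c, r = 32, 8                           # channels, reduction ratio
f = rng.random((c, 18, 18))            # intermediate feature map
W1 = rng.standard_normal((c, c // r)) * 0.1
W2 = rng.standard_normal((c // r, c)) * 0.1
out = channel_attention(f, W1, W2)
print(out.shape)
```

The full module would follow this with a spatial-attention map computed by a convolution over the channel-wise average and max of the gated features.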
4. Model training
The lunar south pole image training set is fed into the AM-TransUNet+ network. The initial learning rate is set to 0.001, the number of training iterations to 300, the number of filters to 112, the filter length to 3, and the dropout value to 0.15.
5. Calculating a loss function
The essence of AM-TransUNet+ prediction is to decide whether each pixel lies on the edge of an impact crater, which is essentially a binary classification problem. The loss function used during AM-TransUNet+ training is the Binary Cross Entropy (BCE) loss:

loss = y_i - y_i t_i + log(1 + exp(-y_i))

where y_i is the predicted value of pixel i in the AM-TransUNet+ output and t_i is the label of this pixel in the ground truth. The loss of an image is the sum of the losses over all pixels; the larger the difference between the predicted image and the labeled image, the larger the loss value.
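Reading y_i as the network's raw (pre-sigmoid) output, the per-pixel formula above is the standard cross-entropy with logits and can be checked numerically:

```python
import numpy as np

def bce_loss(logits, targets):
    # per-pixel loss_i = y_i - y_i*t_i + log(1 + exp(-y_i));
    # the image loss is the sum over all pixels
    return np.sum(logits - logits * targets + np.log1p(np.exp(-logits)))

y = np.array([4.0])                   # confident "crater edge" logit
t = np.array([1.0])                   # true edge pixel
low = bce_loss(y, t)                  # correct prediction: log(1 + e^-4)
high = bce_loss(y, 1.0 - t)           # wrong prediction: 4 + log(1 + e^-4)
print(round(float(low), 3), round(float(high), 3))
```

A confident correct prediction costs about 0.018 while the same logit against the opposite label costs about 4.018, matching the statement that larger prediction-label differences give larger loss.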
6. Counter-propagation
The gradient of the loss function with respect to each parameter is calculated, and the parameters are corrected and updated according to the gradient.
7. Updating weight matrix
The weight matrix is updated according to the parameter gradients obtained by back propagation, so as to reduce the loss function.
8. Impact pit detection and result output
The lunar south pole image to be detected is loaded into the trained model to obtain the final target detection result image.
9. Application of transfer learning to Mars and Mercury
Images to be detected of Mars and Mercury are loaded into the trained model to obtain Mars and Mercury crater detection result images.
10. Model evaluation
To evaluate the performance of the crash pit recognition algorithm, the algorithm was fully tested using a precision-recall (P-R) curve, average Precision (AP) values.
Precision in the P-R curve represents the accuracy of the algorithm:

Precision = N_tp / (N_tp + N_fp)

where N_tp is the number of correctly identified impact pits and N_fp is the number of falsely identified impact pits.
Recall in the P-R curve reflects the miss rate of the algorithm:

Recall = N_tp / (N_tp + N_fn)

where N_fn is the number of missed impact pit targets. The P-R curve is obtained by plotting Precision on the vertical axis against Recall on the horizontal axis while varying the threshold condition. In addition, to reflect the localization accuracy of the identified impact pits, the IoU between the predicted position and the ground-truth position must be considered when computing the P-R curve; in the present invention the IoU threshold is set to 0.5.
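The AP value mentioned above is the area under the P-R curve. One common way to compute it from sampled (precision, recall) pairs is sketched below; the monotone-envelope convention is our assumption, since the patent does not specify the integration rule:

```python
import numpy as np

def average_precision(precisions, recalls):
    """Area under the P-R curve, with precision made monotonically
    non-increasing before integration (a common AP convention)."""
    order = np.argsort(recalls)
    r = np.concatenate(([0.0], np.asarray(recalls, dtype=float)[order]))
    p = np.concatenate(([1.0], np.asarray(precisions, dtype=float)[order]))
    # enforce the monotone precision envelope from the right
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # integrate precision over recall increments
    return float(np.sum(np.diff(r) * p[1:]))
```

A perfect detector (precision 1 at every recall level) yields AP = 1.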
The F1 value is an index used in statistics to measure the accuracy of a binary classification model; it takes both the precision and the recall of the model into account. The F1 value can be regarded as the harmonic mean of the model's precision and recall:

F1 = 2PR / (P + R)

where P is the precision and R is the recall.
Examples
To verify the overall performance of the proposed algorithm, the analysis was performed from several aspects.
1. Impact pit extraction results
Fig. 4 shows the impact pit extraction results for the lunar south pole region of interest. As can be seen from Fig. 4 (b), (d), (f) and (h), the AM-TransUNet+ network strengthens the skip connections through their redesign, and by combining the score-matrix column vectors with the skip connections it can extract the contour information of small impact pits in the lunar south pole region of interest.
2. Performance index comparison
Table 1 shows the performance-index comparison of UNet+, TransUNet+ and the proposed AM-TransUNet+ algorithm. As can be seen from Table 1, the recall of the AM-TransUNet+ network reaches 0.822 and its precision reaches 0.890. In addition, the network achieves good results in about 150 batches, and its total parameter count is 0.98M, substantially fewer parameters than TransUNet+, consistent with the latest research.
Table 1 comparison of different algorithm performance indicators
To understand the training process of the AM-TransUNet+ network, an Epoch-Loss curve was drawn for analysis, as shown in Fig. 5. As can be seen from Fig. 5, the AM-TransUNet+ network converges at approximately 250 epochs, which is fast. The high training speed greatly shortens the training time, the introduced Transformer model causes neither vanishing nor exploding gradients, and the model therefore has strong applicability.
3. Transferability of the model
To verify the transferability of the model, tests were performed on different data sources from Mars and Mercury; the results are shown in Fig. 6. As can be seen from Fig. 6, on this heterologous data the model can detect impact pits of different scales on the Martian surface, and overlapping impact pits are also detected at a certain rate. For impact pit identification on the surface of Mercury, despite the differences in Mercury's landform characteristics, the model can still detect a certain number of impact pits, laying a foundation for further research on the geological structure of Mercury's surface.
What is not described in detail in this specification is prior art known to those skilled in the art. Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements and changes may be made without departing from the spirit and principles of the present invention.
Claims (7)
1. A lunar south pole impact pit detection method based on an improved TransUnet network, characterized by comprising the following steps:
step 1: acquiring a moon south pole LRONAC image;
step 2: preprocessing an input LRONAC image;
step 3: based on the preprocessed LRONAC image, using CraterTools plugins to manually identify real meteorites in ArcGIS software as tag data, and dividing the obtained LRONAC image sample data into a training set, a verification set and a test set;
step 4: constructing an AM-TransUnet+ network model, and training on a training set;
step 5: updating the weight matrix through back propagation, and repeating the step 4 until the AM-TransUnet+network precision reaches preset precision or loss function convergence on the verification set;
step 6: testing the trained AM-TransUnet+ network model on a test set;
step 7: inputting the moon south pole image to be detected into a trained AM-TransUnet+ network, and finally outputting a detection result.
2. The lunar south pole impact pit detection method based on the improved TransUnet network as claimed in claim 1, wherein the specific operation of step 2 comprises:
step 21: performing random sliding window sampling on the LRONAC image data, and randomly scaling the size of a sliding window;
step 22: resampling NAC window data obtained by random sampling to a fixed size;
step 23: and (5) performing inversion and mirror image processing on the resampled LRONAC image.
3. The lunar south pole impact pit detection method based on the improved TransUnet network as claimed in claim 1, wherein the AM-TransUnet+ network model constructed in step 4 comprises an encoder, a decoder and an enhancement module, wherein the encoder encodes the features of the input image vector; the decoder decodes the features of the image vector; and the enhancement module enhances the features of the encoded image vector.
4. The lunar south pole impact pit detection method based on the improved TransUnet network of claim 3, wherein the encoder comprises a convolutional block attention module, a CNN convolution module and a Transformer layer; firstly, image block weights are generated from the column vectors and a score matrix is constructed, and all the weight matrices obtained from the score matrix are summed in the MSA of the Transformer layer to obtain the weight matrix of the Transformer layer; then each encoder feature is enhanced by the weight matrix in the enhancement module and the skipped features are transmitted to the decoder; finally, the weight matrix is up-sampled in the decoder to a size corresponding to the skipped features through a cascaded up-sampler.
5. The lunar south pole impact pit detection method based on the improved TransUnet network as claimed in claim 4, wherein the specific steps of AM-TransUnet+ network detection in step 7 comprise:
step 71: inputting a moon south pole image to be detected;
step 72: performing three downsampling in an encoder to respectively obtain corresponding feature matrixes, wherein each downsampling process comprises one convolution, regularization, reLU activation and a maximum pooling layer;
step 73: after obtaining the feature map extracted by CNN in the encoder, obtaining the spatial information of the image block by adding position embedding in the image block embedding, the output of the first layer encoder is expressed as:
r l '=MSA(LS(r l-1 ))+r l-1
r l =MLP(LS(r l '))+r l '
wherein MSA (-) is a multiple self-care header operation, MLP (-) is a multi-layer perceptron operation, LS (-) represents a layer normalization operation, r l A feature representation representing a layer-1 transformer reconstruction;
step 74: performing multi-stage up-sampling and decoding in a decoder through a cascade up-sampler, and outputting a segmentation result by utilizing the characteristics;
step 75: and generating a final segmentation mask by fusing the characteristics transmitted by the encoder, and identifying the small impact pits in the lunar NAC image according to the segmentation mask.
6. The lunar south pole impact pit detection method based on the improved TransUnet network according to claim 1, wherein when the AM-TransUnet+ network model is trained in step 4, the initial learning rate is set to 0.001, the number of training iterations to 300, the number of filters to 112, the filter length to 3, and the dropout value to 0.15.
7. The lunar south pole impact pit detection method based on the improved TransUnet network of claim 1, wherein the AM-TransUnet+ network model is trained using the Binary Cross-Entropy (BCE) loss function:

loss = y_i - y_i * t_i + log(1 + exp(-y_i))

where y_i is the value of pixel i in the AM-TransUNet+ prediction result and t_i is the label of this pixel in the ground truth.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310113929.4A CN116309348A (en) | 2023-02-15 | 2023-02-15 | Lunar south pole impact pit detection method based on improved TransUnet network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116309348A true CN116309348A (en) | 2023-06-23 |
Family
ID=86793281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310113929.4A Pending CN116309348A (en) | 2023-02-15 | 2023-02-15 | Lunar south pole impact pit detection method based on improved TransUnet network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116309348A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274823A (en) * | 2023-11-21 | 2023-12-22 | 成都理工大学 | Visual transducer landslide identification method based on DEM feature enhancement |
CN117809190A (en) * | 2024-02-23 | 2024-04-02 | 吉林大学 | Impact pit sputter identification method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||