CN116309348A - Lunar south pole impact pit detection method based on improved TransUnet network - Google Patents
- Publication number
- CN116309348A CN116309348A CN202310113929.4A CN202310113929A CN116309348A CN 116309348 A CN116309348 A CN 116309348A CN 202310113929 A CN202310113929 A CN 202310113929A CN 116309348 A CN116309348 A CN 116309348A
- Authority
- CN
- China
- Prior art keywords
- image
- transunet
- network
- model
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000001514 detection method Methods 0.000 title claims description 31
- 238000012549 training Methods 0.000 claims abstract description 23
- 239000011159 matrix material Substances 0.000 claims description 30
- 239000013598 vector Substances 0.000 claims description 16
- 230000006870 function Effects 0.000 claims description 12
- 238000005070 sampling Methods 0.000 claims description 12
- 238000000034 method Methods 0.000 claims description 11
- 238000012360 testing method Methods 0.000 claims description 11
- 230000011218 segmentation Effects 0.000 claims description 10
- 238000012795 verification Methods 0.000 claims description 8
- 230000008569 process Effects 0.000 claims description 6
- 230000002708 enhancing effect Effects 0.000 claims description 5
- 238000007781 pre-processing Methods 0.000 claims description 4
- 238000012545 processing Methods 0.000 claims description 4
- 238000012952 Resampling Methods 0.000 claims description 3
- 230000004913 activation Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 238000011176 pooling Methods 0.000 claims description 3
- 238000000605 extraction Methods 0.000 abstract description 7
- 238000005286 illumination Methods 0.000 abstract description 5
- 238000003709 image segmentation Methods 0.000 abstract description 2
- 238000003058 natural language processing Methods 0.000 abstract 1
- 238000003672 processing method Methods 0.000 abstract 1
- 238000013527 convolutional neural network Methods 0.000 description 6
- 230000000694 effects Effects 0.000 description 5
- 238000011160 research Methods 0.000 description 4
- 230000009191 jumping Effects 0.000 description 3
- 230000003287 optical effect Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000013136 deep learning model Methods 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000006978 adaptation Effects 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000013145 classification model Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000008034 disappearance Effects 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000004880 explosion Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 238000012876 topography Methods 0.000 description 1
- 238000013526 transfer learning Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Quality & Reliability (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
Aiming at the problems that impact craters are difficult to extract from lunar polar images by image-processing methods because of the region's illumination conditions, and that a Digital Elevation Model (DEM) cannot resolve small craters, the invention provides AM-TransUNet+, a small-crater extraction algorithm for future regions of interest at the lunar south pole, built on the Transformer model from natural language processing and the UNet+ convolutional image-segmentation network. The algorithm operates on Lunar Reconnaissance Orbiter Narrow Angle Camera images (LRO NAC, 0.5 m/pixel) and adds a Convolutional Block Attention Module and depthwise separable convolution to the core TransUNet model, which further speeds up model convergence and improves precision while greatly reducing the number of training parameters without loss of model accuracy. The accuracy of the algorithm is additionally verified on Mars and Mercury surface images, and the experimental results show that AM-TransUNet+ exhibits good transferability and accuracy across different deep-space remote sensing data.
Description
Technical Field
The invention relates to the technical field of remote sensing image processing, and in particular to a lunar south pole impact crater detection method based on an improved TransUNet network.
Background
The impact crater is the most prominent landform of the lunar surface; craters vary widely in type and shape, forming ring structures of different sizes and uneven density. Research on impact craters not only supports inferring the relative geological age and surface characteristics of the lunar surface and inverting the presence of water ice, but is also applied to spacecraft positioning and navigation, lunar base site selection, rover obstacle avoidance, and the like.
In recent years, many countries worldwide have established lunar exploration programs, and accurate, rapid crater identification has long been a research focus of the deep-space exploration field. Researchers have proposed a series of lunar Crater Detection Algorithms (CDA), whose development is shown in fig. 1. These methods can be broadly divided into two categories according to their data source: lunar surface images observed by optical sensors, and Digital Elevation Models (DEM) obtained by laser-altimeter scanning. One method constructed and trained a U-Net segmentation model with Keras and TensorFlow on DEM data from the Lunar Reconnaissance Orbiter (LRO) and the Kaguya probe, used skip connections to fuse low-level and high-level features, and identified approximately 92% of the craters in the combined Povilaitis (crater diameter 5-20 km) and Head (crater diameter >20 km) databases. Wang et al. developed a DEM-based CDA applied to the LOLA DEM to detect craters using 3D morphological features such as rim height, interior slope and depth, and constructed a crater dataset (LU1319373) of craters over 1 km in diameter with a detection rate of about 85%. Salih et al. studied crater degradation characteristics on the basis of YOLOv3 crater detection over six mid-latitude areas with varying illumination levels, using Lunar Reconnaissance Orbiter Narrow Angle Camera (LRO NAC) images. Yang et al. used Domain Adaptation (DA) on LRO NAC images to detect unlabeled real data samples efficiently, proposed a new network, CraterDANet, to extract craters in the Chang'e-4 landing zone, and created a new lunar crater dataset containing 20,000 craters.
Because images observed by optical sensors have high resolution, small craters can be detected in them. However, optical sensors are susceptible to illumination, which makes craters difficult to classify by image processing: under severe lighting conditions a crater may be invisible or only partially visible. Because a DEM provides elevation information, craters can be found in it by threshold filtering, but DEM resolution is lower, so small craters may not appear. In summary, although crater detection methods for various sizes have been disclosed in the prior art, the following problems remain for small craters on the order of meters to hundreds of meters in diameter:
1. Unlike the low and mid latitudes, the lunar south pole has strongly undulating topography, large and dynamically changing shadowed areas, and extremely uneven illumination; high-resolution images are therefore better suited than low-resolution DEMs for detecting meter-scale small craters. However, current research covers only experiments in low- and mid-latitude areas, and extracting small craters in lunar polar regions remains challenging.
2. Supervised approaches learn features from a labeled training dataset, which requires a large number of samples. Obtaining good detection results demands high-quality manually labeled training data covering diverse conditions such as terrain, lighting, and crater degradation.
3. Deep learning methods have achieved breakthroughs in the accuracy and reliability of multi-scale crater detection, especially for large craters and even at global scale, but do not perform as well on small craters. The reason is that the receptive fields of top-layer neurons are large, so the information of small-scale targets is not fully preserved.
Disclosure of Invention
Aiming at these problems, the invention provides a lunar south pole impact crater detection method based on an improved TransUNet network, in which a Convolutional Block Attention Module (CBAM), depthwise separable convolution (DSC) and an enhancement module are added on top of TransUNet. The model applies channel and spatial attention simultaneously, so the channel and spatial relationships of features can be explored more thoroughly, and crater contour features are captured more effectively, enabling better recognition of smaller and overlapping craters under the different illumination azimuths of the lunar south pole.
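The parameter saving from depthwise separable convolution can be illustrated with a back-of-the-envelope count; the channel sizes below are illustrative, not taken from the patent:

```python
def conv_params(c_in, c_out, k):
    # standard convolution: one k x k kernel per (input, output) channel pair
    return c_in * c_out * k * k

def dsc_params(c_in, c_out, k):
    # depthwise separable convolution: one k x k kernel per input channel,
    # followed by a 1 x 1 pointwise convolution that mixes channels
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)   # 64*128*9  = 73728
dsc = dsc_params(64, 128, 3)    # 64*9 + 64*128 = 8768
print(std, dsc, round(std / dsc, 1))
```

For a 3×3 layer with 64 input and 128 output channels, the separable form needs roughly 8× fewer weights, which is the mechanism behind the reduced encoder parameter count claimed for the model.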
The technical solution for realizing the purpose of the invention is as follows:
the moon south pole impact pit detection method based on the improved TransUnet network is characterized by comprising the following steps of:
step 1: acquiring a moon south pole LRO NAC image;
step 2: preprocessing an input LRO NAC image;
step 3: based on the preprocessed LRO NAC image, manually identifying real craters in ArcGIS software using the CraterTools plugin as label data, and dividing the obtained LRO NAC image sample data into a training set, a verification set and a test set;
step 4: constructing an AM-TransUnet+ network model, and training on a training set;
step 5: updating the weight matrix through back propagation, and repeating the step 4 until the AM-TransUnet+network precision reaches preset precision or loss function convergence on the verification set;
step 6: testing the trained AM-TransUnet+ network model on a test set;
step 7: inputting the moon south pole image to be detected into a trained AM-TransUnet+ network, and finally outputting a detection result.
Further, the specific operation steps of the step 2 include:
step 21: performing random sliding window sampling on LRO NAC image data, and randomly scaling the size of a sliding window;
step 22: resampling NAC window data obtained by random sampling to a fixed size;
step 23: performing flipping and mirroring on the resampled LRO NAC image.
Further, the AM-TransUnet+ network model constructed in the step 4 comprises an encoder, a decoder and an enhancement module, wherein the encoder is used for encoding the characteristics of the input image vector; the decoder is used for decoding the characteristics of the image vector; the enhancement module is used for enhancing the characteristics of the coded image vector.
Further, the encoder comprises a convolutional block attention module, a CNN convolution module and a Transformer layer. First, image block weights are generated from the column vectors of the score matrix, and the weight matrices obtained from the score matrices in the MSA of each Transformer layer are summed to obtain that layer's weight matrix; then each encoder feature is enhanced by the weight matrix in the enhancement module and the skip features are transmitted to the decoder; finally, the weight matrix is upsampled in the decoder, through the cascade upsampler, to the size corresponding to the skip features.
Further, the specific steps of detection by the AM-TransUNet+ network in step 7 include:
step 71: inputting a moon south pole image to be detected;
step 72: performing downsampling three times in the encoder to obtain the corresponding feature matrices, where each downsampling step comprises one convolution, regularization, ReLU activation and a max pooling layer;
step 73: after obtaining the feature map extracted by the CNN in the encoder, obtaining the spatial information of the image blocks by adding position embeddings to the image block embeddings; the output of the l-th encoder layer is expressed as:

r_l' = MSA(LS(r_{l-1})) + r_{l-1}

r_l = MLP(LS(r_l')) + r_l'

where MSA(·) is the multi-head self-attention operation, MLP(·) is the multi-layer perceptron operation, LS(·) denotes layer normalization, and r_l is the feature representation reconstructed by the l-th Transformer layer;
step 74: performing multi-stage upsampling and decoding in the decoder through the cascade upsampler, and outputting a segmentation result from the features;
step 75: and generating a final segmentation mask by fusing the characteristics transmitted by the encoder, and identifying the small impact pits in the lunar NAC image according to the segmentation mask.
Further, when the AM-TransUNet+ network model is trained in step 4, the initial learning rate is set to 0.001, the number of training iterations to 300, the number of filters to 112, the filter length to 3, and the dropout value to 0.15.
Further, the AM-TransUNet+ network model is trained using the Binary Cross Entropy (BCE) loss function:

loss = y_i - y_i t_i + log(1 + exp(-y_i))

where y_i is the predicted value of pixel i in the AM-TransUNet+ output and t_i is the label of this pixel in the ground truth.
The beneficial effects are that:
First, the invention provides a backbone network adapted to tiny lunar south pole impact craters, which extracts rich context information while preserving detail, and is very effective in improving detection performance.
Second, the invention strengthens the skip function through redesigned skip connections; combining the score-matrix column vectors with the skip connections makes it possible to extract the contour information of small impact craters in the lunar south pole region of interest. Furthermore, combining the CBAM module with the DSC module in the classical TransUNet improves performance while significantly reducing the parameter count of the model's encoder section.
Third, the model transfers very well, achieving good results in Mars and Mercury crater identification.
Drawings
FIG. 1 is a timeline of crater identification research at home and abroad;
FIG. 2 is a flow chart of the present invention;
FIG. 3 is an AM-TransUNet+ model architecture;
FIG. 4 shows the AM-TransUNet+ crater extraction results for lunar south pole regions of interest; fig. 4 (a) and 4 (b) are the input image and detection result for the area around Haworth; fig. 4 (c) and 4 (d) are the input image and detection result for the area near Amundsen; fig. 4 (e) and 4 (f) are the input image and detection result for the area near Shackleton; fig. 4 (g) and 4 (h) are the input image and detection result for the area around Faustini;
FIG. 5 is an AM-TransUNet+ network Epoch-Loss curve;
FIG. 6 shows AM-TransUNet+ detection results on different data; fig. 6 (a) and 6 (b) are the Mars input image and detection result; fig. 6 (c) and 6 (d) are the Mercury input image and detection result;
fig. 7 is a schematic diagram of the score matrix and the redesigned skip connection.
Detailed Description
In order to enable those skilled in the art to better understand the technical solution of the present invention, the technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
The invention provides an end-to-end lunar south pole crater extraction algorithm (AM-TransUNet+), whose main purpose is to detect multi-scale craters, especially small ones, at the lunar south pole. It comprises the following steps:
1. image preprocessing
Narrow Angle Camera (NAC) images from the Lunar Reconnaissance Orbiter (LRO) were acquired at a resolution of 0.5 m/pixel. High-Resolution Stereo Camera (HRSC) images from Mars Express and MESSENGER image data of Mercury (Mercury_MESSENGER_metallic_global_250m) serve as verification data.
Since the size of lunar NAC product data is fixed, while deep learning for image segmentation typically takes square image input, LRO NAC data are sampled with a random sliding window to suit the deep learning model, and the sliding window size is also scaled randomly to accommodate craters of different diameters. The randomly sampled NAC window data are then resampled to a fixed size, such as 144×144, to meet the input requirement of the deep learning model. To further enlarge the training data and improve recognition performance, data augmentation operations such as flipping and mirroring are used during preprocessing.
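The sampling and augmentation steps above can be sketched as follows; the window-size bounds and the random image standing in for a NAC strip are illustrative assumptions, not values from the patent:

```python
import numpy as np

def resample(img, size):
    # nearest-neighbor resampling to a fixed (size x size) grid
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def sample_window(img, rng, out_size=144, min_win=64, max_win=512):
    # random sliding-window crop with a randomly scaled window size
    win = int(rng.integers(min_win, max_win + 1))
    h, w = img.shape
    win = min(win, h, w)
    y = int(rng.integers(0, h - win + 1))
    x = int(rng.integers(0, w - win + 1))
    patch = resample(img[y:y + win, x:x + win], out_size)
    if rng.random() < 0.5:          # random vertical flip
        patch = patch[::-1, :]
    if rng.random() < 0.5:          # random horizontal mirror
        patch = patch[:, ::-1]
    return patch

rng = np.random.default_rng(0)
nac = rng.random((1024, 1024))       # stand-in for an LRO NAC image strip
batch = np.stack([sample_window(nac, rng) for _ in range(8)])
print(batch.shape)                   # (8, 144, 144)
```

Each call yields one fixed-size training patch; a real pipeline would iterate over actual NAC products and apply the same crop to the label mask.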
2. Making data labels and packets
The data labels are made from the preprocessed LRO NAC images: real craters are manually identified in ArcGIS software with the CraterTools plugin and used as label data. The verification and test data are generated in the same way, with the test images drawn from areas different from those of the training images; 80% of the dataset is used as the training set, 10% as the verification set and 10% as the test set.
3. Construction of AM-TransUnet+moon south pole impact pit detection model
AM-TransUNet+ consists of an encoder, a decoder and an enhancement module. AM-TransUNet+ replaces the encoder layer of the original TransUNet with attention-module convolution, generates a weight matrix from the column vectors of the score matrix to strengthen the features and raise the attention paid to key patches, and then upsamples to finally produce a segmentation result at the original image size. The score matrix and the redesigned skip connection, shown in fig. 7, work as follows:
first, image block weights are generated from column vectors and constructed into a matrix. And summing all weight matrixes obtained by the fractional matrix in the MSA of the encoder converter to obtain the weight matrix of the converter layer. Since each transducer layer has an independent weight matrix, there are N weight matrices, which are summed to obtain the final weight matrix.
Wherein w is patch Is the weight of each image block, M i Is the ith th The scoring matrix in layer transform, f (·) is an operation involving a column vector.
Second, the weight matrix is upsampled to the size corresponding to the skip features. Each encoder feature is then enhanced by the weight matrix, and the skip features are transmitted to the decoder:

F_out = F_in × ups(W_patch)

where F_in and F_out are the features before and after the enhancement module respectively, W_patch is the weight matrix, and ups(·) is the upsampling operation.
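A minimal sketch of the enhancement step F_out = F_in × ups(W_patch), assuming nearest-neighbor upsampling and illustrative tensor sizes (9×9 patch grid, 72×72 skip feature):

```python
import numpy as np

def ups_nearest(w, size):
    # nearest-neighbor upsampling of the patch-weight matrix to the
    # spatial size of the skipped encoder feature
    n = w.shape[0]
    idx = np.arange(size) * n // size
    return w[np.ix_(idx, idx)]

def enhance(f_in, w_patch):
    # F_out = F_in * ups(W_patch), broadcast over the channel axis
    h = f_in.shape[1]
    return f_in * ups_nearest(w_patch, h)[None, :, :]

f_in = np.ones((16, 72, 72))                        # (C, H, W) skip feature
w_patch = np.random.default_rng(1).random((9, 9))   # 9x9 patch weights
f_out = enhance(f_in, w_patch)
print(f_out.shape)
```

With F_in set to all ones, every channel of F_out is exactly the upsampled weight map, which makes the broadcasting behavior easy to check.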
The AM-TransUnet+ network specific flow comprises the following steps:
(1) The lunar south pole input image, of size 144×144, is downsampled three times, yielding feature matrices F1, F2 and F3 of sizes (x/2, y/2, c), (x/4, y/4, c) and (x/8, y/8, c) respectively, where c is the number of channels. Each downsampling step includes a convolution, regularization, ReLU activation, and a max pooling layer.
(2) In the encoder path, CBAM is incorporated into the CNN portion of the CNN-Transformer hybrid layer. After the CNN-extracted feature map is obtained, the features are mapped into a new embedding space by a trainable linear projection; adding position embeddings to the image block embeddings yields the image-block spatial information y_input:

y_input = [x_1 E; x_2 E; ...; x_n E] + E_loc

where y_input is the input of the Transformer layer, x_i is a feature image block extracted by the CNN, E is the linear projection, E_loc represents the position embedding, n is the number of blocks, and [ ; ... ; ] is a concatenation operation.
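The patch-embedding formula can be reproduced numerically; the dimensions below (81 patches, flattened patch size 256, embedding size 128) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_patch, d_model = 81, 256, 128    # patches, flattened patch dim, embed dim
patches = rng.random((n, d_patch))    # x_i: flattened CNN feature-map patches
E = rng.random((d_patch, d_model))    # trainable linear projection
E_loc = rng.random((n, d_model))      # learned position embedding

# y_input = [x_1 E; x_2 E; ...; x_n E] + E_loc
y_input = patches @ E + E_loc
print(y_input.shape)                  # (81, 128)
```

Stacking the projected patches row by row and adding E_loc is exactly the concatenation-plus-position-embedding operation written above.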
As shown in fig. 3 (left), each encoder Transformer module comprises layer normalization, a multi-head self-attention module (MSA) containing several self-attention heads, a multi-layer perceptron (MLP), and residual connections; in each layer of the Transformer network, the vector entering the self-attention mechanism or the feedforward network is also carried around it by a residual connection to reinforce the output vector of the self-attention mechanism or feedforward network. The output of the l-th encoder layer can be expressed as:

r_l' = MSA(LS(r_{l-1})) + r_{l-1}

r_l = MLP(LS(r_l')) + r_l'

where MSA(·) is the multi-head self-attention operation, MLP(·) is the multi-layer perceptron operation, LS(·) denotes layer normalization, and r_l is the feature representation reconstructed by the l-th Transformer layer.
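The two equations above can be sketched as a NumPy-only Transformer layer; for brevity a single attention head and random weights are used here, whereas the patent's MSA uses multiple heads:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # the "score matrix"
    return scores @ v

def transformer_layer(r, Wq, Wk, Wv, W1, W2):
    # r_l' = MSA(LS(r_{l-1})) + r_{l-1}
    r1 = self_attention(layer_norm(r), Wq, Wk, Wv) + r
    # r_l = MLP(LS(r_l')) + r_l'   (two-layer ReLU MLP)
    return np.maximum(layer_norm(r1) @ W1, 0) @ W2 + r1

rng = np.random.default_rng(0)
n, d = 81, 64
r = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
W1 = rng.standard_normal((d, 4 * d)) * 0.1
W2 = rng.standard_normal((4 * d, d)) * 0.1
out = transformer_layer(r, Wq, Wk, Wv, W1, W2)
print(out.shape)                      # (81, 64)
```

The intermediate `scores` array is the per-layer score matrix whose column vectors the enhancement module reuses.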
(3) In the decoder, a cascade upsampler performs multi-stage upsampling and decoding, and the segmentation result is output from the features. As shown in fig. 3 (right), after all the convolutions each decoder block in each layer contains a 2× upsampling, a feature concatenation and a convolution operator. After each decoder block, the length and width of the feature are doubled and the number of channels is halved; after all three decoder blocks, the length and width are half those of the original image. Finally, the result is obtained by a 1×1 convolution layer.
In AM-TransUNet+, the convolutional block attention module is a lightweight, general-purpose attention module for feedforward convolutional neural networks. Given an intermediate feature map, the module infers attention maps in turn, which are used to refine the features. The enhancement module uses the column vectors of the score matrix to strengthen the skip function, thereby redesigning the skip connection. The final segmentation mask is generated by fusing the features transmitted from the encoder, and small craters in the lunar NAC image are identified from this mask.
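As a sketch of CBAM's mechanism, the following implements only its channel-attention half (global average- and max-pooled descriptors through a shared two-layer MLP, then a sigmoid gate); the spatial-attention half is omitted and all sizes and weights are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(f, W1, W2):
    # CBAM channel attention: shared MLP applied to global average- and
    # max-pooled channel descriptors, combined and squashed into a gate
    avg = f.mean(axis=(1, 2))                         # (C,)
    mx = f.max(axis=(1, 2))                           # (C,)
    gate = sigmoid(np.maximum(avg @ W1, 0) @ W2 +
                   np.maximum(mx @ W1, 0) @ W2)       # (C,) in (0, 1)
    return f * gate[:, None, None]                    # rescale each channel

rng = np.random.default_rng(0)
c, r = 32, 8                           # channels, reduction ratio
f = rng.random((c, 18, 18))            # intermediate feature map
W1 = rng.standard_normal((c, c // r)) * 0.1
W2 = rng.standard_normal((c // r, c)) * 0.1
out = channel_attention(f, W1, W2)
print(out.shape)
```

The full module would follow this with a spatial-attention map computed by a convolution over the channel-wise average and max of the gated features.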
4. Model training
The lunar south pole image training set is fed into the AM-TransUNet+ network. The initial learning rate is set to 0.001, the number of training iterations to 300, the number of filters to 112, the filter length to 3, and the dropout value to 0.15.
5. Calculating a loss function
The essence of AM-TransUNet+ prediction is to decide whether each pixel lies on the edge of an impact crater, which is essentially a binary classification problem. The loss function used during AM-TransUNet+ training is the Binary Cross Entropy (BCE) loss:

loss = y_i - y_i t_i + log(1 + exp(-y_i))

where y_i is the predicted value of pixel i in the AM-TransUNet+ output and t_i is the label of this pixel in the ground truth. The loss of an image is the sum of the losses over all pixels; the larger the difference between the predicted image and the labeled image, the larger the loss value.
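Reading y_i as the network's raw (pre-sigmoid) output, the per-pixel formula above is the standard cross-entropy with logits and can be checked numerically:

```python
import numpy as np

def bce_loss(logits, targets):
    # per-pixel loss_i = y_i - y_i*t_i + log(1 + exp(-y_i));
    # the image loss is the sum over all pixels
    return np.sum(logits - logits * targets + np.log1p(np.exp(-logits)))

y = np.array([4.0])                   # confident "crater edge" logit
t = np.array([1.0])                   # true edge pixel
low = bce_loss(y, t)                  # correct prediction: log(1 + e^-4)
high = bce_loss(y, 1.0 - t)           # wrong prediction: 4 + log(1 + e^-4)
print(round(float(low), 3), round(float(high), 3))
```

A confident correct prediction costs about 0.018 while the same logit against the opposite label costs about 4.018, matching the statement that larger prediction-label differences give larger loss.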
6. Counter-propagation
The gradient of the loss function with respect to each parameter is calculated, and the parameters are corrected and updated according to the gradient.
7. Updating weight matrix
The weight matrix is updated according to the parameter gradients obtained by back propagation, so as to reduce the loss function.
8. Impact pit detection and result output
The lunar south pole image to be detected is loaded into the trained model to obtain the final target detection result image.
9. Application of transfer learning to Mars and Mercury
Images to be detected of Mars and Mercury are loaded into the trained model to obtain Mars and Mercury crater detection result images.
10. Model evaluation
To evaluate the performance of the crash pit recognition algorithm, the algorithm was fully tested using a precision-recall (P-R) curve, average Precision (AP) values.
Precision in the P-R curve represents the accuracy of the algorithm:

Precision = N_tp / (N_tp + N_fp)

where N_tp is the number of correctly identified impact pits and N_fp is the number of falsely identified impact pits.
Recall in the P-R curve reflects the miss rate of the algorithm:

Recall = N_tp / (N_tp + N_fn)

where N_fn is the number of missed impact pit targets. The P-R curve is obtained by plotting Precision on the vertical axis against Recall on the horizontal axis while varying the threshold condition. In addition, to reflect the localization accuracy of the identified impact pits, the IoU between the predicted position and the ground-truth position must be considered when computing the P-R curve; in the present invention the IoU threshold is set to 0.5.
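The AP value mentioned above is the area under the P-R curve. One common way to compute it from sampled (precision, recall) pairs is sketched below; the monotone-envelope convention is our assumption, since the patent does not specify the integration rule:

```python
import numpy as np

def average_precision(precisions, recalls):
    """Area under the P-R curve, with precision made monotonically
    non-increasing before integration (a common AP convention)."""
    order = np.argsort(recalls)
    r = np.concatenate(([0.0], np.asarray(recalls, dtype=float)[order]))
    p = np.concatenate(([1.0], np.asarray(precisions, dtype=float)[order]))
    # enforce the monotone precision envelope from the right
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # integrate precision over recall increments
    return float(np.sum(np.diff(r) * p[1:]))
```

A perfect detector (precision 1 at every recall level) yields AP = 1.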
The F1 value is an index used in statistics to measure the accuracy of a binary classification model; it takes both the precision and the recall of the model into account. The F1 value can be regarded as the harmonic mean of the model's precision and recall:

F1 = 2PR / (P + R)

where P is the precision and R is the recall.
Examples
To verify the overall performance of the proposed algorithm, the analysis was performed from several aspects.
1. Impact pit extraction results
Fig. 4 shows the impact pit extraction results for the lunar south pole region of interest. As can be seen from Fig. 4 (b), (d), (f) and (h), the AM-TransUNet+ network strengthens the skip connections through their redesign, and by combining the score-matrix column vectors with the skip connections it can extract the contour information of small impact pits in the lunar south pole region of interest.
2. Performance index comparison
Table 1 shows the performance-index comparison of UNet+, TransUNet+ and the proposed AM-TransUNet+ algorithm. As can be seen from Table 1, the recall of the AM-TransUNet+ network reaches 0.822 and its precision reaches 0.890. In addition, the network achieves good results in about 150 batches, and its total parameter count is 0.98M, substantially fewer parameters than TransUNet+, consistent with the latest research.
Table 1 comparison of different algorithm performance indicators
To understand the training process of the AM-TransUNet+ network, an Epoch-Loss curve was drawn for analysis, as shown in Fig. 5. As can be seen from Fig. 5, the AM-TransUNet+ network converges at approximately 250 epochs, which is fast. The high training speed greatly shortens the training time, the introduced Transformer model causes neither vanishing nor exploding gradients, and the model therefore has strong applicability.
3. Transferability of the model
To verify the transferability of the model, tests were performed on different data sources from Mars and Mercury; the results are shown in Fig. 6. As can be seen from Fig. 6, on this heterologous data the model can detect impact pits of different scales on the Martian surface, and overlapping impact pits are also detected at a certain rate. For impact pit identification on the surface of Mercury, despite the differences in Mercury's landform characteristics, the model can still detect a certain number of impact pits, laying a foundation for further research on the geological structure of Mercury's surface.
What is not described in detail in this specification is prior art known to those skilled in the art. Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements and changes may be made without departing from the spirit and principles of the present invention.
Claims (7)
1. A lunar south pole impact pit detection method based on an improved TransUnet network, characterized by comprising the following steps:
step 1: acquiring a moon south pole LRONAC image;
step 2: preprocessing an input LRONAC image;
step 3: based on the preprocessed LRONAC image, using CraterTools plugins to manually identify real meteorites in ArcGIS software as tag data, and dividing the obtained LRONAC image sample data into a training set, a verification set and a test set;
step 4: constructing an AM-TransUnet+ network model, and training on a training set;
step 5: updating the weight matrix through back propagation, and repeating the step 4 until the AM-TransUnet+network precision reaches preset precision or loss function convergence on the verification set;
step 6: testing the trained AM-TransUnet+ network model on a test set;
step 7: inputting the moon south pole image to be detected into a trained AM-TransUnet+ network, and finally outputting a detection result.
2. The lunar south pole impact pit detection method based on the improved TransUnet network as claimed in claim 1, wherein the specific operation of step 2 comprises:
step 21: performing random sliding window sampling on the LRONAC image data, and randomly scaling the size of a sliding window;
step 22: resampling NAC window data obtained by random sampling to a fixed size;
step 23: and (5) performing inversion and mirror image processing on the resampled LRONAC image.
3. The lunar south pole impact pit detection method based on the improved TransUnet network as claimed in claim 1, wherein the AM-TransUnet+ network model constructed in step 4 comprises an encoder, a decoder and an enhancement module, wherein the encoder encodes the features of the input image vector; the decoder decodes the features of the image vector; and the enhancement module enhances the features of the encoded image vector.
4. The lunar south pole impact pit detection method based on the improved TransUnet network of claim 3, wherein the encoder comprises a convolutional block attention module, a CNN convolution module and a Transformer layer; firstly, image block weights are generated from the column vectors and a score matrix is constructed, and all the weight matrices obtained from the score matrix are summed in the MSA of the Transformer layer to obtain the weight matrix of the Transformer layer; then each encoder feature is enhanced by the weight matrix in the enhancement module and the skipped features are transmitted to the decoder; finally, the weight matrix is up-sampled in the decoder to a size corresponding to the skipped features through a cascaded up-sampler.
5. The lunar south pole impact pit detection method based on the improved TransUnet network as claimed in claim 4, wherein the specific steps of AM-TransUnet+ network detection in step 7 comprise:
step 71: inputting a moon south pole image to be detected;
step 72: performing three downsampling in an encoder to respectively obtain corresponding feature matrixes, wherein each downsampling process comprises one convolution, regularization, reLU activation and a maximum pooling layer;
step 73: after obtaining the feature map extracted by CNN in the encoder, obtaining the spatial information of the image block by adding position embedding in the image block embedding, the output of the first layer encoder is expressed as:
r l '=MSA(LS(r l-1 ))+r l-1
r l =MLP(LS(r l '))+r l '
wherein MSA (-) is a multiple self-care header operation, MLP (-) is a multi-layer perceptron operation, LS (-) represents a layer normalization operation, r l A feature representation representing a layer-1 transformer reconstruction;
step 74: performing multi-stage up-sampling and decoding in a decoder through a cascade up-sampler, and outputting a segmentation result by utilizing the characteristics;
step 75: and generating a final segmentation mask by fusing the characteristics transmitted by the encoder, and identifying the small impact pits in the lunar NAC image according to the segmentation mask.
6. The lunar south pole impact pit detection method based on the improved TransUnet network according to claim 1, wherein when the AM-TransUnet+ network model is trained in step 4, the initial learning rate is set to 0.001, the number of training iterations to 300, the number of filters to 112, the filter length to 3, and the dropout value to 0.15.
7. The lunar south pole impact pit detection method based on the improved TransUnet network of claim 1, wherein the AM-TransUnet+ network model is trained using the Binary Cross-Entropy (BCE) loss function:

loss = y_i - y_i * t_i + log(1 + exp(-y_i))

where y_i is the value of pixel i in the AM-TransUNet+ prediction result and t_i is the label of this pixel in the ground truth.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310113929.4A CN116309348A (en) | 2023-02-15 | 2023-02-15 | Lunar south pole impact pit detection method based on improved TransUnet network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116309348A true CN116309348A (en) | 2023-06-23 |
Family
ID=86793281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310113929.4A Pending CN116309348A (en) | 2023-02-15 | 2023-02-15 | Lunar south pole impact pit detection method based on improved TransUnet network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116309348A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274823A (en) * | 2023-11-21 | 2023-12-22 | 成都理工大学 | Visual transducer landslide identification method based on DEM feature enhancement |
CN117809190A (en) * | 2024-02-23 | 2024-04-02 | 吉林大学 | Impact pit sputter identification method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||