CN114842078A - Dual-channel satellite attitude estimation network based on deep learning - Google Patents


Info

Publication number
CN114842078A
CN114842078A (application CN202210393253.4A)
Authority
CN
China
Prior art keywords
satellite
information
model
attitude estimation
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210393253.4A
Other languages
Chinese (zh)
Inventor
任元
叶瑞达
王煜晶
王卫杰
宋铁岭
王元钦
刘通
刘政良
刘钰菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peoples Liberation Army Strategic Support Force Aerospace Engineering University
Original Assignee
Peoples Liberation Army Strategic Support Force Aerospace Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peoples Liberation Army Strategic Support Force Aerospace Engineering University filed Critical Peoples Liberation Army Strategic Support Force Aerospace Engineering University
Priority to CN202210393253.4A priority Critical patent/CN114842078A/en
Publication of CN114842078A publication Critical patent/CN114842078A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

A dual-channel satellite attitude estimation network based on deep learning is disclosed. The workflow comprises the following steps: constructing and splitting a satellite attitude estimation data set, preprocessing the data, extracting image features with a ResNet model, learning spatial position information with an improved Vision Transformer model, learning rotation information with an Hourglass network, outputting the satellite's position information and attitude information separately, computing the distance between the outputs and the labels, backpropagating that distance through the model for iterative training to obtain the optimal model, and evaluating the model on the test set. The self-attention mechanism at the core of the improved Vision Transformer model can focus on the satellite's location in the image, and the repeated up-sampling and down-sampling performed by the Hourglass network can effectively infer the key points of the satellite contour. The invention uses the dual-channel network to learn the satellite's spatial position information and attitude information separately, which effectively avoids mutual interference between the two kinds of information, improves the accuracy of satellite attitude estimation, and provides a new intelligent means for space non-cooperative target detection.

Description

Dual-channel satellite attitude estimation network based on deep learning
Technical Field
The invention relates to the field of target detection, and in particular to a dual-channel satellite attitude estimation network based on deep learning.
Background
Vision-based satellite attitude estimation is a difficult problem that urgently needs solving in the aerospace field, with important applications in navigation, on-orbit servicing, space debris removal and the like. However, attitude estimation from pure vision faces many unsolved technical problems, such as the difficulties that the space illumination environment creates for camera imaging and the mutual interference between a satellite's spatial position information and its own rotation information. These problems pose many challenges for a purely vision-based satellite attitude estimation solution.
The Vision Transformer imitates the attention mechanism of the human brain: when learning image features it automatically focuses on the detected object, and it can effectively identify the spatial position information of the satellite body within the imaged scene. The Hourglass network is widely used in image keypoint detection; it can effectively detect keypoint information of the satellite contour, and by learning the contour keypoints the rotation angle of the satellite in the image relative to a reference coordinate frame can be computed, thereby learning the satellite's rotation information. A dual-channel network effectively decouples the satellite's spatial position information from its own rotation information and reduces the mutual interference between the two.
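As background, the self-attention mechanism the Vision Transformer relies on can be sketched in a few lines. The projection sizes and the plain single-head, linear form below are illustrative assumptions — the patent's own "nonlinear residual self-attention" module adds residual and nonlinear terms not shown here:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Minimal scaled dot-product self-attention over a token sequence.

    x: (seq_len, d_model) feature tokens; w_q/w_k/w_v: (d_model, d_k)
    projection matrices. Illustrative sketch only.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])       # (seq_len, seq_len) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over keys
    return attn @ v, attn

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 32))            # 16 tokens, 32-dim features
w = [rng.standard_normal((32, 8)) * 0.1 for _ in range(3)]
out, attn = self_attention(tokens, *w)
```

Each row of `attn` is a probability distribution over all tokens, which is what lets the model "focus" on the image patches containing the satellite.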
The granted patent CN109827578B discloses a "satellite relative attitude estimation method based on contour similarity", which projects simulated images and performs similarity analysis on satellite image contours, but the method falls short in recognition efficiency and robustness. The granted patent CN105300384B discloses an "interactive filtering method for satellite attitude determination", which determines satellite attitude information from data collected by the satellite's sensors; however, it can only measure the satellite body's own attitude and cannot identify the attitude of a non-cooperative satellite.
Disclosure of Invention
(I) Objects of the invention
The invention aims to provide a dual-channel satellite attitude estimation network based on deep learning. A camera photographs a non-cooperative satellite, and a dual-channel deep learning network processes the image: a ResNet model first extracts image features, then the two channels learn the satellite's position information and rotation information separately. An improved Vision Transformer model learns the satellite's spatial position information and an Hourglass network learns the satellite's attitude information, effectively decoupling the spatial position information from the satellite's own rotation information and providing a new solution for satellite attitude estimation and, more broadly, for non-cooperative target detection.
(II) technical scheme
The technical scheme of the invention is a dual-channel satellite attitude estimation network based on deep learning, characterized by comprising: making a satellite attitude data set; dividing it into a training set, a validation set and a test set; preprocessing the image information; inputting the preprocessed training and validation sets into a ResNet model to extract image features; learning the satellite's spatial position information with an improved Vision Transformer model and the satellite's attitude information with an Hourglass network; computing the distance between the model's predictions and the labels and backpropagating it through the network for iterative training to obtain the optimal satellite attitude estimation model; and finally inputting the test set into the optimal model to evaluate its performance.
The method comprises: constructing a satellite attitude rotation data set; preprocessing the satellite attitude image data; extracting local features of the satellite image with a convolutional neural network; position-encoding the local features with a position encoder; extracting features again from the encoded image features with a nonlinear residual self-attention mechanism; processing the features with a post-processing module; and finally outputting attitude prediction information through a fully connected layer.
A satellite attitude estimation image data set is made from simulated images, and the attitude information of the satellite in each image is annotated as a label.
The satellite attitude estimation image data set is divided into a training set, a validation set and a test set in a given ratio.
The data set is preprocessed, standardizing the images and labels so they can be fed into network training.
The preprocessed training and validation sets are input into a ResNet model, whose convolutional layers, pooling layers and residual connections extract the satellite image features; residual networks such as ResNet-18, ResNet-34 or ResNet-50 may be used.
An improved Vision Transformer model learns the satellite's spatial position information while an Hourglass network simultaneously learns the satellite's attitude information. Compared with a conventional Vision Transformer, the improved model first compresses the four-dimensional feature matrix produced by the ResNet into a three-dimensional matrix, i.e. [batch, channel, height, width] is compressed into [batch, height, channel × width]. This effectively avoids the problem that the conventional Vision Transformer flattens the image height and width into a single column before feeding the features into the self-attention mechanism, which destroys the spatial position of objects in the image.
The network outputs the satellite's spatial position information and attitude information separately, computes the distance between the outputs and the annotated attitude information, backpropagates this distance through the network, and performs iterative optimization training to obtain the optimal satellite attitude estimation model.
The preprocessed test set is then input into the satellite attitude estimation model, and the resulting attitude estimates are compared with the test set's annotated attitude information to evaluate the model.
The invention thereby realizes a dual-channel satellite attitude estimation network based on deep learning.
(III) Major advantages of the invention
The technical scheme of the invention has the following advantages: correlation analysis is performed on the characteristics of satellite attitude information, and the satellite's spatial position information and rotation information are successfully decoupled by a dual-channel network; an improved Vision Transformer model learns the spatial position information and an Hourglass network learns the attitude information. The design of the dual-channel network and the strengths of each channel's network match the characteristics of satellite attitude information and effectively improve the accuracy of satellite attitude estimation.
Drawings
FIG. 1 is a flow diagram of the present invention;
FIG. 2 is a partial sample from the URSO satellite attitude estimation data set in an embodiment of the present invention.
Detailed Description
To make the technical scheme, advantages and purposes of the invention clearer, the technical scheme is further explained below with reference to the accompanying drawings and the method flow of a specific example.
Embodiment 1 of the invention is a dual-channel satellite attitude estimation network based on deep learning which, as shown in FIG. 1, comprises the following steps:
the method comprises the steps of manufacturing a satellite attitude estimation image data set, adopting a simulation image, marking attitude information of a satellite in the image and manufacturing the satellite attitude estimation image data set into a label, and rendering a satellite picture into a vivid space environment by using non Engine 4 software when manufacturing the data set, or using a URSO satellite attitude estimation data set manufactured by university of California scholars by using non Engine 4 software.
The satellite attitude estimation image data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio. The training set is used to train the satellite attitude estimation model; the validation set is fed into the model alongside the training set and is mainly used to tune the model's hyperparameters; the test set is used to evaluate the performance of the trained model.
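The 7:2:1 split described above can be sketched as follows; the shuffling seed and list-based sample representation are assumptions for illustration:

```python
import random

def split_dataset(samples, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle a list of samples and split it into train/val/test subsets
    in the 7:2:1 ratio used in the embodiment."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    items = list(samples)
    random.Random(seed).shuffle(items)    # deterministic shuffle
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
```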
The satellite attitude estimation image data set is preprocessed: image preprocessing techniques first process the satellite images, converting the images and labels into data types that can be fed into a deep learning network.
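A minimal preprocessing sketch, assuming the ImageNet channel statistics commonly used with ResNet backbones (the patent does not specify the exact standardization):

```python
import numpy as np

def preprocess(image, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Standardize an H x W x 3 uint8 image and move channels first,
    producing the CHW tensor layout deep learning frameworks expect."""
    x = image.astype(np.float32) / 255.0          # scale to [0, 1]
    x = (x - np.array(mean)) / np.array(std)      # per-channel standardize
    return x.transpose(2, 0, 1)                   # HWC -> CHW

img = (np.ones((224, 224, 3)) * 127).astype(np.uint8)  # dummy grey image
x = preprocess(img)
```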
The preprocessed training and validation sets are input into a ResNet model, whose convolutional layers, pooling layers and residual connections extract the satellite image features. Specifically, ResNet-18 extracts the features, with the global average pooling layer and the final fully connected layer of the original network removed to preserve the spatial feature resolution.
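The residual connection the ResNet feature extractor relies on can be illustrated with a naive NumPy basic block; the shapes and the unoptimized convolution are for illustration only, not the patent's implementation:

```python
import numpy as np

def conv3x3(x, w):
    """Naive 'same'-padded 3x3 convolution over a (C, H, W) feature map;
    w has shape (C_out, C_in, 3, 3)."""
    c_out = w.shape[0]
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = xp[:, i:i + 3, j:j + 3]
            out[:, i, j] = np.tensordot(w, patch, axes=3)
    return out

def residual_block(x, w1, w2):
    """ResNet basic block: two 3x3 convs plus the identity shortcut,
    y = relu(F(x) + x) -- the residual connection mentioned in the text."""
    h = np.maximum(conv3x3(x, w1), 0.0)           # conv + ReLU
    return np.maximum(conv3x3(h, w2) + x, 0.0)    # add shortcut, ReLU

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((4, 4, 3, 3)) * 0.1
w2 = rng.standard_normal((4, 4, 3, 3)) * 0.1
y = residual_block(x, w1, w2)
```

Because the shortcut passes `x` through unchanged, gradients flow directly to earlier layers, which is what makes deep feature extractors like ResNet-18 trainable.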
An improved Vision Transformer model learns the satellite's spatial position information while an Hourglass network simultaneously learns the satellite's attitude information. Compared with a conventional Vision Transformer, the improved model first compresses the four-dimensional feature matrix produced by the ResNet into a three-dimensional matrix, i.e. [batch, channel, height, width] is compressed into [batch, height, channel × width]. This effectively avoids the problem that the conventional Vision Transformer flattens the image height and width into a single column before feeding the features into the self-attention mechanism, which destroys the spatial position of objects in the image.
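The reshaping that distinguishes the improved tokenization from the conventional ViT flattening can be shown with NumPy; the array sizes are illustrative:

```python
import numpy as np

# Feature map from the ResNet backbone: [batch, channel, height, width].
feat = np.arange(2 * 4 * 8 * 8, dtype=np.float32).reshape(2, 4, 8, 8)

# Conventional ViT tokenization: flatten the spatial grid into one
# sequence axis, [batch, height*width, channel] -- vertical and
# horizontal positions are merged into a single column of tokens.
vit_tokens = feat.reshape(2, 4, 8 * 8).transpose(0, 2, 1)

# Improved tokenization described in the patent: keep the height axis as
# the sequence and merge channel with width,
# [batch, channel, height, width] -> [batch, height, channel*width],
# so each token is one image row and vertical positions survive intact.
rows_first = feat.transpose(0, 2, 1, 3)          # [batch, height, channel, width]
improved_tokens = rows_first.reshape(2, 8, 4 * 8)
```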
The network outputs the satellite's spatial position information and rotation information separately, so three loss functions are designed: (1) a spatial position loss for the satellite; (2) a rotation loss for the satellite body; (3) an overall loss:
[Equation (1): spatial position loss L_pos — given as an image in the original]
[Equation (2): body rotation loss L_ori — given as an image in the original]
L = αL_pos + βL_ori (α + β = 1) (3)
where L, L_pos and L_ori denote the total loss, the satellite's spatial position loss and the satellite body's rotation loss respectively; equations (1) and (2) compare the spatial position label and the rotation label against the spatial position and rotation information output by the model (the label and output symbols are likewise given as images in the original).
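Since equations (1) and (2) appear only as images in the filing, the concrete loss forms below are assumptions — a Euclidean position error and a sign-insensitive quaternion angle are common choices for satellite pose estimation; only equation (3) is taken directly from the text:

```python
import numpy as np

def position_loss(t_pred, t_true):
    """Assumed form of L_pos: Euclidean distance between the predicted
    position and the position label."""
    return np.linalg.norm(t_pred - t_true)

def orientation_loss(q_pred, q_true):
    """Assumed form of L_ori: angular distance between unit quaternions,
    insensitive to the q / -q sign ambiguity."""
    q_pred = q_pred / np.linalg.norm(q_pred)
    q_true = q_true / np.linalg.norm(q_true)
    dot = np.clip(abs(np.dot(q_pred, q_true)), -1.0, 1.0)
    return 2.0 * np.arccos(dot)

def total_loss(t_pred, t_true, q_pred, q_true, alpha=0.5):
    """Equation (3): L = alpha*L_pos + beta*L_ori with alpha + beta = 1."""
    beta = 1.0 - alpha
    return alpha * position_loss(t_pred, t_true) + beta * orientation_loss(q_pred, q_true)

t_true = np.array([1.0, 2.0, 3.0])
q_true = np.array([1.0, 0.0, 0.0, 0.0])
loss = total_loss(t_true + 0.1, t_true, q_true, q_true)  # small position error only
```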
The three loss functions are backpropagated through the network, which is optimized with gradient descent; after multiple rounds of iterative training the optimal satellite attitude estimation model is obtained.
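The gradient-descent iteration can be illustrated on a toy objective; the quadratic loss and learning rate below are assumptions for demonstration (real training would use minibatches and typically an optimizer such as Adam, which the patent does not specify):

```python
import numpy as np

def train(params, grad_fn, lr=0.1, steps=100):
    """Plain gradient-descent loop: repeatedly step parameters against
    the loss gradient returned by grad_fn."""
    p = params.astype(np.float64)
    for _ in range(steps):
        p -= lr * grad_fn(p)          # descend the loss surface
    return p

# Toy quadratic loss L(p) = ||p - target||^2 with gradient 2*(p - target).
target = np.array([3.0, -1.0])
p_opt = train(np.zeros(2), lambda p: 2.0 * (p - target))
```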
The preprocessed test set is input into the satellite attitude estimation model to obtain attitude estimates for the test set, which are compared with the test set's annotated attitude information to evaluate the model.
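The comparison against the annotated test labels might be sketched as follows; the specific metrics (mean Euclidean position error and mean quaternion angle over the test set) are assumptions, since the patent does not name its evaluation metrics:

```python
import numpy as np

def evaluate(pred_pos, true_pos, pred_ori, true_ori):
    """Compare test-set predictions with their labels.

    pred_pos/true_pos: (N, 3) positions; pred_ori/true_ori: (N, 4) unit
    quaternions. Returns mean position error and mean angular error.
    """
    pos_err = np.linalg.norm(pred_pos - true_pos, axis=1).mean()
    dots = np.clip(np.abs(np.sum(pred_ori * true_ori, axis=1)), 0.0, 1.0)
    ori_err = (2.0 * np.arccos(dots)).mean()      # sign-insensitive angle
    return pos_err, ori_err

pred_pos = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
quats = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
pos_err, ori_err = evaluate(pred_pos, pred_pos, quats, quats)  # perfect predictions
```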
Those skilled in the art will appreciate that the invention may be practiced without these specific details.

Claims (2)

1. A dual-channel satellite attitude estimation network based on deep learning, characterized by mainly comprising the following steps:
step 1, constructing a satellite attitude estimation image data set from simulated images and annotating the attitude information of the satellite in each image;
step 2, dividing the data set into a training set, a validation set and a test set in a given ratio;
step 3, preprocessing the satellite data;
step 4, inputting the processed training and validation sets into a ResNet model and extracting image features with components such as the ResNet model's convolutional layers, pooling layers and residual connections;
step 5, learning the satellite's spatial position information with an improved Vision Transformer model and the satellite's rotation information with an Hourglass network;
step 6, outputting the satellite's spatial position information and attitude information separately, computing the distance between the outputs and the annotated attitude information, backpropagating this distance through the network, and performing iterative optimization training to obtain the optimal satellite attitude estimation model;
step 7, inputting the preprocessed test set into the satellite attitude estimation model to obtain attitude estimates for the test set and comparing them with the test set's annotated attitude information, thereby evaluating the satellite attitude estimation model.
2. The deep learning based two-channel satellite attitude estimation network of claim 1, wherein:
In step 5, the improved Vision Transformer model is used to learn the satellite's spatial position information. Compared with the conventional Vision Transformer, after processing the ResNet model's features the improved Vision Transformer first compresses the four-dimensional matrix into a three-dimensional matrix, i.e. [batch, channel, height, width] into [batch, height, channel × width], which effectively avoids the problem that the conventional Vision Transformer flattens the image height and width into a single column before feeding the features into the self-attention mechanism, destroying the spatial position of objects in the image.
CN202210393253.4A 2022-04-14 2022-04-14 Dual-channel satellite attitude estimation network based on deep learning Pending CN114842078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210393253.4A CN114842078A (en) 2022-04-14 2022-04-14 Dual-channel satellite attitude estimation network based on deep learning


Publications (1)

Publication Number Publication Date
CN114842078A true CN114842078A (en) 2022-08-02

Family

ID=82565680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210393253.4A Pending CN114842078A (en) 2022-04-14 2022-04-14 Dual-channel satellite attitude estimation network based on deep learning

Country Status (1)

Country Link
CN (1) CN114842078A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503689A (en) * 2019-08-30 2019-11-26 清华大学 Attitude prediction method, model training method and device
CN110969124A (en) * 2019-12-02 2020-04-07 重庆邮电大学 Two-dimensional human body posture estimation method and system based on lightweight multi-branch network
CN112183637A (en) * 2020-09-29 2021-01-05 中科方寸知微(南京)科技有限公司 Single-light-source scene illumination re-rendering method and system based on neural network
CN113095106A (en) * 2019-12-23 2021-07-09 华为数字技术(苏州)有限公司 Human body posture estimation method and device
CN113850151A (en) * 2021-09-02 2021-12-28 中山大学 Method, device, terminal and storage medium for identifying distraction behavior of driver
CN114332639A (en) * 2021-11-30 2022-04-12 中国人民解放军战略支援部队航天工程大学 Satellite attitude vision measurement algorithm of nonlinear residual error self-attention mechanism


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Mengxi et al., "Optical-image space target attitude estimation method based on semantic keypoint extraction with an Hourglass network", Signal Processing, vol. 37, no. 9, pp. 1653-1662 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051632A (en) * 2022-12-06 2023-05-02 中国人民解放军战略支援部队航天工程大学 Six-degree-of-freedom attitude estimation algorithm for double-channel transformer satellite
CN116051632B (en) * 2022-12-06 2023-12-05 中国人民解放军战略支援部队航天工程大学 Six-degree-of-freedom attitude estimation algorithm for double-channel transformer satellite


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination