CN111401455B - Remote sensing image deep learning classification method and system based on Capsules-Unet model - Google Patents
- Publication number
- CN111401455B CN111401455B CN202010199056.XA CN202010199056A CN111401455B CN 111401455 B CN111401455 B CN 111401455B CN 202010199056 A CN202010199056 A CN 202010199056A CN 111401455 B CN111401455 B CN 111401455B
- Authority
- CN
- China
- Prior art keywords
- capsule
- capsules
- model
- data
- remote sensing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/24: Pattern recognition; classification techniques
- G06N3/045: Neural networks; combinations of networks
- G06V20/13: Scenes; terrestrial scenes; satellite images
Abstract
The invention discloses a remote sensing image deep learning classification method and system based on the Capsules-Unet model, comprising the following steps: preprocessing remote sensing image data and dividing the preprocessed data into training set data and verification set data; taking the Unet model as the basic network architecture and fusing the capsule (Capsules) model into it to establish the Capsules-Unet model; training the Capsules-Unet model with the training set data and verification set data to obtain the trained Capsules-Unet model; and classifying the remote sensing image data to be classified with the trained Capsules-Unet model. The invention establishes Capsules-Unet, a remote sensing image deep learning classification model capable of encapsulating the multidimensional characteristics of ground objects, and improves the dynamic routing algorithm of the existing capsule model, so that high-resolution remote sensing images are classified more accurately and efficiently.
Description
Technical Field
The invention relates to the field of remote sensing image classification, and in particular to a remote sensing image deep learning classification method and system based on the Capsules-Unet model.
Background
Classification is a fundamental problem in the field of remote sensing, and image classification with models built on deep Convolutional Neural Networks (CNNs) has become the dominant trend. Compared with traditional remote sensing image classification methods, deep convolutional neural networks require no hand-crafted features. They are usually composed of multiple successive layers that automatically learn extremely complex hierarchical features from large amounts of data, avoiding the heavy reliance on expert knowledge to design features. Common architectures such as Fully Convolutional Networks (FCN), U-net and Generative Adversarial Networks (GANs), together with other skip-connection neural networks, have become preferred models for various image classification tasks and show huge potential in remote sensing applications. For example, in the classification of complex urban terrain types, a convolutional neural network that automatically extracts multi-scale features can reach classification accuracy above 90%. To address the coarseness of upsampling in convolutional neural networks, model structures have been improved, and the ability of models to discriminate ground objects has been greatly enhanced through data enhancement, multi-scale fusion, post-processing (CRF, voting, etc.) and additional features (elevation information, vegetation indices and spectral features).
Despite the great success of convolutional neural networks, some challenging problems remain in remote sensing classification. The main reasons are as follows:
(1) Remote sensing images are more complex than natural images. They contain many types of objects that vary greatly in size, color, location and orientation. Spectral characteristics alone may be insufficient to distinguish objects; features based on spatial location and the like may also be required. How to combine rich spectral information and spatial information as complementary clues, so as to significantly improve the performance of deep learning in remote sensing classification, is therefore a current research hotspot.
(2) Remote sensing applications often lack large labeled datasets; the difficulty lies not only in the number of samples but also in defining the category labels within them. Current deep networks require a large number of hyper-parameters to be configured, making the whole network too complex to optimize. Overfitting is therefore inevitable when deep neural networks are trained on a small dataset.
(3) The current "end-to-end" learning strategy makes the excellent performance of deep learning in classification tasks difficult to interpret. Beyond the final network output, it is hard to understand the prediction logic hidden inside the network. This hinders further mining and processing in remote sensing image classification.
In view of the above, researchers have proposed many approaches to these challenges. Recently, Sabour et al. designed a new type of neuron, the "capsule," to replace the traditional scalar neuron and used it to construct the capsule network (CapsNet). A capsule is a carrier encapsulating a group of neurons. Its output is a high-dimensional vector that can express multiple attributes of an entity, such as pose, illumination and deformation. The probability that the entity exists is represented by the length (norm) of this vector: the larger the norm of the high-dimensional vector, the more likely the entity is present. The vector can represent the orientation of ground objects and their relative spatial relationships, which largely overcomes the shortcomings of convolutional neural networks. The training algorithm of the capsule network is principally a dynamic routing mechanism between capsules of consecutive layers. This dynamic routing mechanism can improve the robustness of model training and prediction while reducing the number of samples required.
Disclosure of Invention
The invention aims to solve the technical problem of establishing a remote sensing image deep learning classification model capable of packaging multidimensional characteristics of surface features so as to more accurately classify high-resolution remote sensing images; the dynamic routing algorithm of the existing capsule model is improved, the operation memory burden is reduced, and data parameters are reduced, so that the high-resolution remote sensing images can be classified more efficiently.
According to one aspect of the invention, a remote sensing image deep learning classification method based on a Capsules-Unet model is provided, and comprises the following steps: s1, carrying out data preprocessing on remote sensing image data, and dividing the preprocessed remote sensing image data into training set data and verification set data; s2, fusing a capsule (Capsules) model by using the Unet model as a basic network architecture, and establishing a Capsules-Unet model; s3, training the Capsules-Unet model by using the training set data and the verification set data to obtain the trained Capsules-Unet model; and S4, classifying the remote sensing image data to be classified by using the trained Capsules-Unet model.
Optionally, step S3 further includes verifying the Capsules-Unet model with the verification set data, wherein training iteration stops, and training of the Capsules-Unet model is complete, when the error falls below a given threshold or the maximum number of iterations is reached.
Optionally, the Capsules-Unet model comprises: a feature extraction module comprising an input convolution layer and a convolution capsule layer (ConCaps), the input convolution layer extracting low-level features of the input remote sensing image; the convolution capsule layer (ConCaps) performing convolution filtering on the low-level features extracted by the input convolution layer and converting them into capsules; a contraction path module, which comprises a plurality of primary capsule layers (PrimaryCaps) and down-samples the capsules produced by the feature extraction module; an extended path module, which includes a plurality of primary capsule layers (PrimaryCaps) and a plurality of deconvolution capsule layers (DeconCaps) arranged to alternate with each other and up-samples the capsules from the contraction path module; it also comprises an output primary capsule layer that convolves the data obtained from the up-sampling in the extended path module and outputs it to the classification module; a skip connection layer (skip layer), through which the extended path module clips and copies the low-level features in the contraction path module for use in the up-sampling of the extended path module; and a classification module comprising a classification Capsule layer (Class Capsule) with a plurality of capsules, the activation-vector norm of each capsule giving the probability that an instance of each class exists.
Optionally, in step S3 and step S4, an improved locally constrained dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a child capsule is routed to a parent capsule in the next layer, the same transformation matrix is used for child and parent capsules of the same type.
Optionally, in step S3 and step S4, an improved locally constrained dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a child capsule is routed to a parent capsule in the next layer, the child capsule is routed to the parent capsule only within one defined local window.
According to another aspect of the invention, a remote sensing image deep learning classification system based on a Capsules-Unet model is provided, which comprises: the data preprocessing unit is used for preprocessing the remote sensing image data and dividing the preprocessed remote sensing image data into training set data and verification set data; a model establishing unit, which takes the Unet model as a basic network architecture, fuses capsule (Capsules) models and establishes Capsules-Unet models; the model training unit is used for training the Capsules-Unet model by utilizing the training set data and the verification set data to obtain the trained Capsules-Unet model; and the classification unit is used for classifying the remote sensing image data to be classified by using the trained Capsules-Unet model.
Optionally, the model training unit further includes a model verification subunit configured to verify the Capsules-Unet model with the verification set data, wherein training iteration stops, and training of the Capsules-Unet model is complete, when the error falls below a given threshold or the maximum number of iterations is reached.
Optionally, the Capsules-Unet model in the system comprises: a feature extraction module comprising an input convolution layer and a convolution capsule layer (ConCaps), the input convolution layer extracting low-level features of the input remote sensing image; the convolution capsule layer (ConCaps) performing convolution filtering on the low-level features extracted by the input convolution layer and converting them into capsules; a contraction path module, which comprises a plurality of primary capsule layers (PrimaryCaps) and down-samples the capsules produced by the feature extraction module; an extended path module, which includes a plurality of primary capsule layers (PrimaryCaps) and a plurality of deconvolution capsule layers (DeconCaps) arranged to alternate with each other and up-samples the capsules from the contraction path module; it also comprises an output primary capsule layer that convolves the data obtained from the up-sampling in the extended path module and outputs it to the classification module; a skip connection layer (skip layer), through which the extended path module clips and copies the low-level features in the contraction path module for use in the up-sampling of the extended path module; and a classification module comprising a classification Capsule layer (Class Capsule) with a plurality of capsules, the activation-vector norm of each capsule giving the probability that an instance of each class exists.
Optionally, in the model training unit and the classification unit, an improved locally constrained dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a child capsule is routed to a parent capsule in the next layer, the same transformation matrix is used for child and parent capsules of the same type.
Optionally, in the model training unit and the classification unit, an improved locally constrained dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a child capsule is routed to a parent capsule in the next layer, the child capsule is routed to the parent capsule only within one defined local window.
The technical scheme of the invention has the following beneficial technical effects: a Capsules-Unet deep learning classification model capable of encapsulating the multidimensional characteristics of ground objects is established, classifying high-resolution remote sensing images more accurately; and the dynamic routing algorithm of the existing capsule model is improved, reducing the operation memory burden and the data parameters, so that high-resolution remote sensing images are classified more efficiently.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a remote sensing image deep learning classification method according to the invention.
FIG. 2 is a flow diagram of a remote sensing image deep learning classification method according to an embodiment of the invention.
FIG. 3 is a schematic diagram of the Capsules-Unet model structure of the present invention.
FIG. 4 is a further detailed block diagram of the Capsules-Unet model structure of FIG. 3 in accordance with the present invention.
FIG. 5 is a schematic structural diagram of a remote sensing image deep learning classification system according to the invention.
FIG. 6 is a schematic diagram of an improved locally constrained dynamic routing algorithm of the Capsules-Unet model of the present invention.
FIG. 7 is a schematic diagram of the process of updating coupling coefficients by the improved locally constrained dynamic routing algorithm of the Capsules-Unet model of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of systems consistent with certain aspects of the invention, as detailed in the appended claims.
As shown in fig. 1, according to an aspect of the present invention, the present invention provides a remote sensing image deep learning classification method based on Capsules-Unet model, including: s1, carrying out data preprocessing on remote sensing image data, and dividing the preprocessed remote sensing image data into training set data and verification set data; s2, fusing a capsule (Capsules) model by using the Unet model as a basic network architecture, and establishing a Capsules-Unet model; s3, training the Capsules-Unet model by using the training set data and the verification set data to obtain the trained Capsules-Unet model; and S4, classifying the remote sensing image data to be classified by using the trained Capsules-Unet model.
Data pre-processing
The high-resolution remote sensing image data of the invention come from the Vaihingen and Potsdam datasets of the ISPRS 2D semantic labeling benchmark. The ISPRS Vaihingen dataset comprises 33 orthorectified images of different sizes with ground truth data, all taken over the Vaihingen area in Germany, with a ground sampling distance of 9 cm and 3 channels: near infrared, red and green (IRRG). The data include 6 surface feature categories: impervious surfaces, buildings, low vegetation, trees, vehicles, and other (background). During model training, the images with ground truth label data were divided into a training set and a verification set: 31 images were selected as the training set and the remaining 2 images were used as the verification set.
The Potsdam dataset contains 38 orthorectified images of different sizes with corresponding Digital Surface Models (DSMs). The DSM is an array of the same size as the input image, providing an elevation value at each pixel. The images were taken over the Potsdam area of Germany with a ground sampling distance of 5 cm and consist of 4 channels: near infrared, red, green and blue (IRRGB). In the experiment, to increase the number of bands in the Potsdam dataset, the NDVI index was calculated from the near-infrared and red bands. The dataset provides ground truth label data for 24 images for model training and validation. The Potsdam dataset has the same feature classes as the Vaihingen dataset. During model training, 23 images were selected as the training set and the remaining 1 image as the verification set. The image numbers of the labeled data in the two datasets, divided into training samples and test samples, are shown in Table 1.
TABLE 1 image numbering of labeled data divided into training samples and test samples
See fig. 2. In the data preprocessing step, the original images are sampled into 64 × 64-pixel patches using random sliding-window sampling. 80% of the samples in the sample library are used as training samples and 20% as test samples. To increase the diversity and variability of the samples, a large amount of labeled training data is fed into the model after normalization, random sampling and data enhancement. A sketch of this sampling is given below.
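As a minimal sketch of this preprocessing (the patent gives no code, so the function names, the per-patch min-max normalization and the sample counts are assumptions), the sampling and 80/20 split might look like:

```python
import numpy as np

def random_sliding_windows(image, label, window=64, n_samples=1000, rng=None):
    """Randomly sample window x window patches from an image and its label map.

    A sketch of the preprocessing described above; the normalization choice
    and sample count are assumptions, not taken from the patent.
    """
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    patches, targets = [], []
    for _ in range(n_samples):
        y = rng.integers(0, h - window + 1)
        x = rng.integers(0, w - window + 1)
        patch = image[y:y + window, x:x + window].astype(np.float32)
        # Per-patch min-max normalization to [0, 1] (one common choice).
        patch = (patch - patch.min()) / (patch.max() - patch.min() + 1e-8)
        patches.append(patch)
        targets.append(label[y:y + window, x:x + window])
    return np.stack(patches), np.stack(targets)

def train_test_split_80_20(x, y, rng=None):
    """80% training / 20% test split of the sample library."""
    rng = np.random.default_rng(rng)
    idx = rng.permutation(len(x))
    cut = int(0.8 * len(x))
    return (x[idx[:cut]], y[idx[:cut]]), (x[idx[cut:]], y[idx[cut:]])
```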
Capsule-Unet model establishment
The present invention builds on the new neuron designed by Sabour et al., the "capsule," which replaces traditional scalar neurons in constructing the capsule network. The capsule network is composed of multiple layers of capsules. Each capsule involves two kinds of quantities: weights and coupling coefficients. Each weight matrix W represents a linear transformation that carries the spatial relationships between low-level features and high-level features, as well as other important relationships such as pose (position, size, orientation), deformation, velocity, hue and texture. The coupling coefficient c_ij determines to which higher-level capsule the output of a lower-level capsule is directed. Unlike the weights, c_ij is updated by dynamic routing; the coupling coefficients thus essentially determine how information flows between capsules. The coefficients c_ij between each lower-level capsule and all potential higher-level capsules sum to 1. The capsule network is built on capsules and aims to overcome the inability of traditional CNNs to recognize the pose information of an entity and the part-whole relationships between objects.
The invention provides Capsules-Unet, a high-spatial-resolution image classification model with the U-net model as its basic framework. The invention designs the classification model around the concept and structure of the capsule, so as to improve classification performance through the viewpoint invariance and interaction mechanism of the capsule model.
Referring to fig. 3 and 4, the Capsules-Unet model of the present invention comprises: a feature extraction module 1, which comprises an input convolution layer 11 and a convolution capsule layer (ConCaps) 12, the input convolution layer 11 extracting low-level features of the input remote sensing image; the convolution capsule layer (ConCaps) 12 performing convolution filtering on the low-level features extracted by the input convolution layer 11 and converting them into capsules; a contraction path module 2, which includes a plurality of primary capsule layers (PrimaryCaps) 21 and down-samples the capsules obtained by the feature extraction module 1; an extended path module 3, which includes a plurality of primary capsule layers (PrimaryCaps) 31 and a plurality of deconvolution capsule layers (DeconCaps) 32 arranged to alternate with each other and up-samples the capsules from the contraction path module; it also comprises an output primary capsule layer 33 that convolves the data obtained from the up-sampling in the extended path module 3 and outputs it to the classification module 5; a skip connection layer (skip layer) 4, through which the extended path module 3 clips and copies the low-level features in the contraction path module 2 for use in the up-sampling of the extended path module 3; and a classification module 5 comprising a classification Capsule layer (Class Capsule) 52, the classification capsule layer 52 comprising a plurality of capsules, the activation-vector norm of each capsule giving the probability that an instance of each class exists.
According to an alternative embodiment of the present invention, the input convolution layer 11 contains 16 5 × 5 convolution kernels with stride 1; it takes a 64 × 64 × 3 image as input and outputs a 64 × 64 × 1 × 16 tensor. The primary capsule layers (PrimaryCaps) 21, 31 function in many respects like CNN convolution layers; however, they take capsules as input and use a locally constrained dynamic routing algorithm to determine their output, which is also composed of capsules. The deconvolution capsule layers (DeconCaps) 32 operate with transposed convolutions to compensate for the loss of global connectivity caused by locally constrained dynamic routing. The classification Capsule layer (Class Capsule) 52 may include k capsules (corresponding to k classification categories); the activation-vector norm of each capsule gives the probability that an instance of each class exists. Each capsule in the previous layer is fully connected to the capsules in this layer.
As shown in fig. 4, the input is a 64 × 64 multiband remote sensing image. First, the input convolution layer 11 of the feature extraction module 1 applies 16 convolution kernels of size 5 × 5 with stride 1 and outputs a 64 × 64 × 1 × 16 tensor, generating 16 feature maps of the same spatial dimension. The convolution capsule layer 12 then applies 2 capsule convolution kernels of size 5 × 5 with stride 2 to the 16 basic features detected by the input convolution layer 11, producing a first set of feature-combination capsule outputs of size 32 × 32 × 2 × 16. The feature extraction module 1 thus captures the distinguishing features of the input data and feeds them into the following modules. Next, following the Unet model, the architecture is composed of a contraction path module 2 and an extended path module 3. The contraction path module 2 down-samples the image through 4 successive primary capsule layers (PrimaryCaps) 21: capsule convolutions of 5 × 5 × 4 × 16 with stride 1 (4 capsules, 16 feature maps), 5 × 5 × 4 × 16 with stride 2 (4 capsules, 16 feature maps), 5 × 5 × 8 × 16 with stride 1 (8 capsules, 16 feature maps), and 5 × 5 × 8 × 32 with stride 2 (8 capsules, 32 feature maps). The size of the image at the bottom layer is 8 × 8 × 8 × 32. By adding capsule layers to the Unet in place of convolution layers, part-whole context information is preserved. The extended path module 3 alternates 3 primary capsule layers (PrimaryCaps) 31 with 3 deconvolution capsule layers (DeconCaps) 32 and additionally includes an output primary capsule layer 33: a 5 × 5 × 8 × 32 capsule convolution with stride 1 (8 capsules, 32 feature maps), a 4 × 4 × 8 × 32 deconvolution capsule (8 capsules, 32 feature maps), a 5 × 5 × 4 × 32 capsule convolution with stride 1 (4 capsules, 32 feature maps), a 4 × 4 × 4 × 16 deconvolution capsule (4 capsules, 16 feature maps), a 5 × 5 × 4 × 16 capsule convolution with stride 1 (4 capsules, 16 feature maps), and a 4 × 4 × 2 × 16 deconvolution capsule (2 capsules, 16 feature maps). When the image reaches the uppermost layer, i.e. after the 3rd deconvolution, its size is 64 × 64 × 2 × 16; the output primary capsule layer 33 then convolves the up-sampled data from the extended path module 3 and outputs it to the classification module 5. Features in the contraction path module 2 are clipped and copied through the skip connection layer (skip layer) 4 for the corresponding up-sampling in the extended path module 3. The classification module 5 comprises a classification Capsule layer (Class Capsule) 52; the classification capsule layer 52 includes a plurality of capsules, and the activation-vector norm of each capsule gives the probability that an instance of each class exists. A layer-by-layer shape walkthrough of this encoder-decoder is sketched below.
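A minimal sketch of the walkthrough above, encoding the kernel sizes, strides and output shapes quoted in the text as data and checking the stride arithmetic (the layer names are hypothetical; this is a bookkeeping aid, not an implementation of the capsule layers):

```python
# Each entry: (name, kernel, stride, output H x W x capsule_types x atoms),
# following the shapes described for Fig. 4.
CAPSULES_UNET_SPEC = [
    ("input_conv",     5, 1, (64, 64, 1, 16)),
    ("conv_caps",      5, 2, (32, 32, 2, 16)),
    ("primary_caps_1", 5, 1, (32, 32, 4, 16)),
    ("primary_caps_2", 5, 2, (16, 16, 4, 16)),
    ("primary_caps_3", 5, 1, (16, 16, 8, 16)),
    ("primary_caps_4", 5, 2, (8,  8,  8, 32)),  # bottom of the contraction path
    ("primary_caps_5", 5, 1, (8,  8,  8, 32)),
    ("decon_caps_1",   4, 2, (16, 16, 8, 32)),
    ("primary_caps_6", 5, 1, (16, 16, 4, 32)),
    ("decon_caps_2",   4, 2, (32, 32, 4, 16)),
    ("primary_caps_7", 5, 1, (32, 32, 4, 16)),
    ("decon_caps_3",   4, 2, (64, 64, 2, 16)),
]

def check_spec(spec, in_hw=64):
    """Verify that strided layers halve and deconvolutions double the spatial size."""
    hw = in_hw
    for name, k, s, (h, w, caps, atoms) in spec:
        hw = hw * s if name.startswith("decon") else hw // s
        assert hw == h == w, f"{name}: expected {hw}, spec says {h}x{w}"
        print(f"{name:15s} k={k} s={s} -> {h}x{w}x{caps}x{atoms}")

check_spec(CAPSULES_UNET_SPEC)
```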
Model training and validation
The operating environment of the embodiment of the invention is built on an NVIDIA Quadro P600 GPU and the Keras deep learning platform. The Capsules-Unet model was trained on a computer with a 3.7 GHz 8-core CPU and 32 GB of memory.
Referring to fig. 2, in the model training phase all training set data are input into the Capsules-Unet model, and the verification set data are used to judge the model's performance under different parameter values. The maximum number of training cycles is set to 10000. The batch size per training step is 30, i.e. 30 samples are input each time to fit the Capsules-Unet model. The initial learning rate is 0.001. All parameters of the Capsules-Unet model are updated with the Adam optimization method. Iteration stops when the error falls below a given threshold or the maximum number of iterations is reached. A minimal training-configuration sketch is given below.
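A minimal Keras sketch of this training configuration, under stated assumptions: the capsule network itself is replaced by a small stand-in model, the loss is a stand-in for the margin loss of formula (1), and the dummy data only make the snippet runnable.

```python
import numpy as np
from tensorflow import keras

def build_stand_in_model(input_shape=(64, 64, 3), n_classes=6):
    """Stand-in for the Capsules-Unet graph (the capsule layers are omitted);
    only the training configuration below comes from the text."""
    inp = keras.Input(shape=input_shape)
    x = keras.layers.Conv2D(16, 5, padding="same", activation="relu")(inp)
    out = keras.layers.Conv2D(n_classes, 1, activation="softmax")(x)
    return keras.Model(inp, out)

model = build_stand_in_model()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),  # initial lr 0.001
              loss="sparse_categorical_crossentropy")  # margin loss (1) in the patent

x_train = np.random.rand(60, 64, 64, 3).astype("float32")  # dummy data
y_train = np.random.randint(0, 6, (60, 64, 64, 1))
x_val, y_val = x_train[:10], y_train[:10]

model.fit(x_train, y_train,
          batch_size=30,  # 30 samples fitted per training step
          epochs=3,       # the text caps training at 10000 cycles; kept small here
          validation_data=(x_val, y_val),
          # Stop when the validation error stops improving past a small
          # threshold, mirroring the "error below a given threshold" rule.
          callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss",
                                                   min_delta=1e-4, patience=20)])
```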
Classification
In the classification stage, the trained Capsules-Unet model classifies the data to be classified, generating preliminary class predictions. The prediction data are then input into the classification capsule layer (Class Capsule) 52, and the final surface feature class is produced according to the probability of each predicted class.
For a classification task involving k classes, the classification Capsule layer (Class Capsule) 52 has k capsules, each representing one class. Since the length of a capsule's output vector represents the presence of a visual entity, the length ‖v_c‖ of each capsule in the last layer represents the probability of class k. Each predefined class contributes a margin loss term L_k, which plays a role similar to Softmax in multi-class tasks:
L_k = T_k max(0, m^+ − ‖v_c‖)^2 + λ_margin (1 − T_k) max(0, ‖v_c‖ − m^−)^2    (1)

where m^+ = 0.9, m^− = 0.1 and λ_margin = 0.5. T_k is the indicator function of the classification, i.e. T_k = 1 when class k is present and 0 otherwise. ‖v_c‖ represents the output probability of the capsule; m^+ is the upper bound, penalizing false positives (predicting that class k exists when it does not); m^− is the lower bound, penalizing false negatives (predicting that class k does not exist when it does); and λ_margin is a proportionality coefficient balancing the two terms. A sketch of this loss is given below.
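A sketch of this margin loss in TensorFlow, assuming the classification-capsule norms ‖v_c‖ have already been computed (the tensor names are illustrative):

```python
import tensorflow as tf

def margin_loss(y_true, v_norm, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss of formula (1).

    y_true: one-hot class indicators T_k, shape (batch, k).
    v_norm: lengths ||v_c|| of the k classification-capsule activation
            vectors, shape (batch, k).
    """
    positive = y_true * tf.square(tf.maximum(0.0, m_pos - v_norm))
    negative = lam * (1.0 - y_true) * tf.square(tf.maximum(0.0, v_norm - m_neg))
    return tf.reduce_sum(positive + negative, axis=-1)

# Example: a confident, correct prediction incurs almost no loss.
y = tf.constant([[0.0, 1.0, 0.0]])
v = tf.constant([[0.05, 0.95, 0.10]])
print(margin_loss(y, v).numpy())  # small value
```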
Referring to fig. 5, according to another aspect of the present invention, there is provided a remote sensing image deep learning classification system based on the Capsules-Unet model, comprising: the data preprocessing unit 101, which preprocesses the remote sensing image data and divides the preprocessed data into training set data and verification set data; the model establishing unit 102, which takes the Unet model as the basic network architecture, fuses the capsule (Capsules) model and establishes the Capsules-Unet model; the model training unit 103, which trains the Capsules-Unet model with the training set data and verification set data to obtain the trained Capsules-Unet model; and the classification unit 104, which classifies the remote sensing image data to be classified with the trained Capsules-Unet model.
According to an optional embodiment of the present invention, the model training unit 103 further comprises a model verification subunit configured to verify the Capsules-Unet model using the verification set data.
Local constraint dynamic routing algorithm
The original capsule network and dynamic routing algorithm occupy a large amount of memory, and model operation is very time-consuming, because when the dynamic routing algorithm determines the coefficients for routing a child capsule to a parent capsule in the next layer, an additional intermediate representation is required to store the outputs of the child capsules in a given layer. To solve the problems of excessive memory burden and parameter explosion, the invention therefore proposes an improved locally constrained dynamic routing algorithm based on the original dynamic routing algorithm.
According to one embodiment of the improved locally constrained dynamic routing algorithm of the present invention, when data of a sub-capsule (a current-layer capsule) of the Capsules-Unet model is routed to data of a parent capsule (a next-layer capsule), the same transformation matrix is used for the same type of sub-capsule and parent capsule.
As shown in fig. 6, in layer l of the network there is a set of capsule types T^l = {t_1^l, t_2^l, ..., t_n^l}. For every t_i^l ∈ T^l there exists an h_l × w_l × z_l grid of capsules, where h_l × w_l is the size of the output of layer l−1. In layer l+1 of the network there is a set of capsule types T^{l+1}; for every t_j^{l+1} ∈ T^{l+1} there exists an h_{l+1} × w_{l+1} × z_{l+1} grid of capsules, where h_{l+1} × w_{l+1} is the size of the output of layer l.

Taking the convolution capsule layer 12 as an example, each parent capsule p_{xy} ∈ P receives a set of "prediction vectors", {û_{xy|t_1}, û_{xy|t_2}, ..., û_{xy|t_n}}. This set is defined as the matrix multiplication between a transformation matrix M_{t_i} and the outputs u_{t_i} of the child capsules within a kernel centered at (x, y) in layer l, i.e. for any t_i^l there exists û_{xy|t_i} = M_{t_i} · u_{t_i}. Thus each u_{t_i} has shape k_h × k_w × z_l, where k_h × k_w is the size of the user-defined kernel. For all capsule types in T^l, each M_{t_i} has shape k_h × k_w × z_l × |T^{l+1}| × z_{l+1}, where |T^{l+1}| is the number of parent capsule types in layer l+1. It is worth noting that each M_{t_i} is independent of the spatial position (x, y), because the same transformation matrix is shared across all spatial positions of a given capsule type. Briefly, locally constrained dynamic routing performs a convolution within each capsule of the lower layer (each capsule convolves a tensor with the same dimensions as all the capsules of the upper layer) and then routes the convolution result of each lower-layer capsule. This is why matrix sharing can be exploited here to significantly reduce the number of parameters, as the sketch below illustrates.
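To illustrate why sharing M_{t_i} across spatial positions shrinks the model, a small counting sketch (the layer numbers plugged in are illustrative, not taken from the patent tables):

```python
def transform_params(kh, kw, z_l, n_parent_types, z_next, h, w, shared=True):
    """Number of transformation-matrix parameters for one child capsule type.

    With sharing, each child type keeps a single kh*kw*z_l x |T^{l+1}|*z_{l+1}
    matrix; without sharing, a separate matrix would be needed at every
    spatial position (x, y).
    """
    per_matrix = kh * kw * z_l * n_parent_types * z_next
    return per_matrix if shared else per_matrix * h * w

# Illustrative numbers: 5x5 kernel, 16 atoms in, 4 parent types, 16 atoms out,
# on a 32x32 spatial grid.
print(transform_params(5, 5, 16, 4, 16, 32, 32, shared=True))   # 25,600
print(transform_params(5, 5, 16, 4, 16, 32, 32, shared=False))  # 26,214,400
```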
To determine each parent capsule p_{xy} ∈ P, a weighted sum of these "prediction vectors" is computed, p_{xy} = Σ_{t_i} r_{t_i|xy} û_{xy|t_i}, where the r_{t_i|xy} are routing coefficients determined by the locally constrained dynamic routing algorithm. These routing coefficients are calculated by a "Routing Softmax",

r_{t_i|xy} = exp(b_{t_i|xy}) / Σ_{x',y'} exp(b_{t_i|x'y'})    (2)

where b_{t_i|xy} is the log prior probability that child capsule t_i is routed to parent capsule p_{xy}, initialized to 0 and iteratively updated as

b_{t_i|xy} ← b_{t_i|xy} + û_{xy|t_i} · v_{xy}    (3)

The initial b_{t_i|xy} is independent of the current input image; it depends only on the location and type of the two capsules. The initial coupling coefficients are then iteratively improved by measuring the agreement between the current output v_{xy} of each capsule in the layer and the prediction vector û_{xy|t_i}.
In the Capsules-Unet model, because information is transported as vectors through the preceding layers, the "capsule" must be activated in a way that handles direction. The activation function of the Capsules-Unet model is the squashing function, whose expression is shown in formula (4):

v_{xy} = (‖p_{xy}‖^2 / (1 + ‖p_{xy}‖^2)) · (p_{xy} / ‖p_{xy}‖)    (4)

where p_{xy} and v_{xy} respectively represent the input vector of capsule j at spatial location (x, y) and its output vector; p_{xy} is in fact the weighted sum of all vectors output to capsule j by the previous layer. The first part of the formula scales the input vector p_{xy}, and the second part, p_{xy}/‖p_{xy}‖, is its unit vector. The squashing function keeps the norm of the output in the range 0 to 1 while preserving the direction of the input vector. When ‖p_{xy}‖ is zero, v_{xy} is close to 0; when ‖p_{xy}‖ tends to infinity, ‖v_{xy}‖ approaches 1.
The dynamic process by which locally constrained dynamic routing updates the coupling coefficients between capsules is shown in fig. 7. In the first iteration of dynamic routing, because the temporary variables b_{t_i|xy} are initialized to zero, the coupling coefficients from capsule i to all capsules in layer l+1 are equal. A weighted sum over all received inputs û_{xy|t_i} is then computed to obtain p_{xy}, where the weights are the coupling coefficients r_{t_i|xy}. Next, p_{xy} undergoes the nonlinear transformation of the squashing function in formula (4) to obtain v_{xy}, and finally b_{t_i|xy} is updated according to formula (3). After r iterations, the output v_{xy} of the capsule is returned. Typically, the optimal number of iterations in the experiments is 3. A simplified sketch of the squashing function and this routing loop follows.
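A simplified NumPy sketch of the squashing function (4) and the routing loop of fig. 7 for a single parent position. For readability it normalizes the coefficients over the child capsules feeding one parent, whereas the locally constrained algorithm of formula (2) normalizes over the parent positions within each child's window; it is a per-window illustration, not the full layer implementation.

```python
import numpy as np

def squash(p, axis=-1, eps=1e-8):
    """Squashing non-linearity of formula (4): scales the norm into [0, 1)
    while preserving the vector's direction."""
    norm_sq = np.sum(p ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * p / np.sqrt(norm_sq + eps)

def local_dynamic_routing(u_hat, n_iters=3):
    """Routing over one local window for a single parent capsule position.

    u_hat: prediction vectors, shape (n_children, atoms).
    """
    b = np.zeros(u_hat.shape[0])            # log priors, initialized to zero
    for _ in range(n_iters):                # 3 iterations are typically optimal
        r = np.exp(b) / np.exp(b).sum()     # routing softmax over children
        p = (r[:, None] * u_hat).sum(0)     # weighted sum of predictions
        v = squash(p)                       # formula (4)
        b = b + u_hat @ v                   # agreement update, formula (3)
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(25 * 8, 16))       # e.g. a 5x5 window of 8 child capsules
print(np.linalg.norm(local_dynamic_routing(u_hat)))  # a value in [0, 1)
```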
According to another embodiment of the improved locally constrained dynamic routing algorithm of the present invention, in the model training and classification steps, when data of a child capsule of the Capsules-Unet model is routed to a parent capsule in the next layer, the child capsule is routed to the parent capsule only within one defined local window. As shown in fig. 6, only a local window region of size k_h × k_w × z_l of the layer-l capsules is sampled, rather than operating over the entire capsule grid, which greatly reduces the amount of computation, the operation memory burden and the number of data parameters.
Evaluation method
In the present invention, the Overall Accuracy (OA) and the Kappa coefficient are used to evaluate the classification performance of the inventive method. The overall accuracy is the percentage of correctly classified samples in the whole test set. The Kappa coefficient is another widely used evaluation criterion, based on the confusion matrix, for evaluating the accuracy of remote sensing classification. Its equation is as follows:
κ = (N Σ_{i=1}^{n} M_{ii} − Σ_{i=1}^{n} M_{i+} M_{+i}) / (N^2 − Σ_{i=1}^{n} M_{i+} M_{+i})    (5)

where N is the total number of samples, n is the number of classes, M_{ij} is the (i, j)-th entry of the confusion matrix C, and M_{i+} and M_{+i} respectively represent the sums of the i-th row and i-th column of C. A small sketch computing both metrics is given below.
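A small sketch computing both metrics from a confusion matrix, following formula (5) (the toy matrix is illustrative only):

```python
import numpy as np

def overall_accuracy_and_kappa(conf):
    """Compute OA and the Kappa coefficient from an n x n confusion matrix C."""
    conf = np.asarray(conf, dtype=np.float64)
    n_total = conf.sum()                 # N, total number of samples
    oa = np.trace(conf) / n_total        # overall accuracy
    row = conf.sum(axis=1)               # M_{i+}, row sums
    col = conf.sum(axis=0)               # M_{+i}, column sums
    expected = (row * col).sum()
    kappa = (n_total * np.trace(conf) - expected) / (n_total ** 2 - expected)
    return oa, kappa

# Toy 3-class confusion matrix.
C = [[50, 2, 3],
     [5, 40, 5],
     [2, 3, 45]]
print(overall_accuracy_and_kappa(C))  # approx. (0.871, 0.806)
```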
Comparative analysis
The invention compares the Capsules-Unet model with CapsNet and Unet, with the following results. Both the Capsules-Unet model and CapsNet achieve good results in classifying impervious surfaces, buildings and low vegetation; these classes show good area connectivity and clear object edges. However, vehicles and other distinctive, small-coverage categories are not well identified and are easily confused with buildings. The homogeneous regions in the Vaihingen dataset are small, and feature extraction for them is not accurate enough. Although both the proposed Capsules-Unet model and CapsNet retain the spatial information between ground objects in capsule form, when homogeneous regions are small the spatial information of small target regions such as vehicles is limited. The Unet model works well for classifying impervious surfaces and low vegetation, but building boundaries are unclear.
Table 2 lists the class accuracies of the three models on the Vaihingen dataset. As the table shows, the method of the present invention achieves high classification performance. In terms of Overall Accuracy (OA), Capsules-Unet is 1.22% higher than CapsNet and 1.89% higher than Unet. The Kappa coefficients of the Capsules-Unet, CapsNet and Unet models are 0.74, 0.74 and 0.72, respectively. The classification accuracy of Capsules-Unet on the Vaihingen dataset is slightly better than that of CapsNet and Unet.
TABLE 2 precision evaluation results of Vaihingen dataset and Potsdam dataset
The features and advantages of the invention have been illustrated by reference to examples. Accordingly, the invention is expressly not limited to these exemplary embodiments, which illustrate some possible non-limiting combinations of features that may also be present alone or in other combinations.
The above-mentioned embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may still modify or easily conceive of changes to the technical solutions described in the foregoing embodiments, or substitute equivalents for some of their technical features, within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention and should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
1. A remote sensing image deep learning classification method based on Capsules-Unet model comprises the following steps:
s1, carrying out data preprocessing on remote sensing image data, and dividing the preprocessed remote sensing image data into training set data and verification set data;
s2, taking the Unet model as a basic network architecture, fusing the capsule model, and establishing a Capsules-Unet model;
the Capsules-Unet model comprises:
the characteristic extraction module comprises an input convolution layer and a convolution capsule layer, wherein the input convolution layer is used for extracting low-level characteristics of an input remote sensing image; the convolution capsule layer is used for performing convolution filtering processing on the low-level features extracted by the input convolution layer and converting the low-level features into capsules;
the contraction path module comprises a plurality of main capsule layers and is used for performing down-sampling processing on the capsules obtained by the characteristic extraction module;
an extended path module including a plurality of main capsule layers and a plurality of deconvolution capsule layers, the main capsule layers and the deconvolution capsule layers being configured to be interleaved with each other for upsampling capsules from the contracted path module; the extended path module also comprises an output main capsule layer which is used for carrying out convolution processing on data obtained by the up-sampling processing in the extended path module and outputting the data to the classification module;
a jump connection layer, the extension path module clipping and replicating low-level features in the contraction path module through the jump connection layer for upsampling processing in the extension path module;
a classification module comprising a classification capsule layer comprising a plurality of capsules, an activation vector modulo length of each capsule of the plurality of capsules for calculating a probability of whether an instance of each class exists;
s3, training the Capsules-Unet model by using the training set data and the verification set data to obtain the trained Capsules-Unet model;
and S4, classifying the remote sensing image data to be classified by using the trained Capsules-Unet model.
2. The remote sensing image deep learning classification method according to claim 1, characterized in that: step S3 further comprises verifying the Capsules-Unet model by using the verification set data, wherein training iteration is stopped, and training of the Capsules-Unet model is completed, when the set error is smaller than a given threshold value or the maximum iteration number is met.
3. The remote sensing image deep learning classification method according to claim 1, characterized in that: in step S3 and step S4, an improved local constraint dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a sub-capsule is routed to data of a next-layer parent capsule, the same transformation matrix is adopted for the same type of sub-capsule and parent capsule.
4. The remote sensing image deep learning classification method according to claim 1 or 3, characterized in that: in step S3 and step S4, an improved local constraint dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a child capsule is routed to data of a next-layer parent capsule, the child capsule is routed to the parent capsule only in one defined local window.
5. A remote sensing image deep learning classification system based on Capsules-Unet model comprises:
the data preprocessing unit is used for preprocessing the remote sensing image data and dividing the preprocessed remote sensing image data into training set data and verification set data;
a model establishing unit, which takes the Unet model as a basic network architecture, fuses the capsule model and establishes a Capsules-Unet model;
the Capsules-Unet model comprises:
the characteristic extraction module comprises an input convolution layer and a convolution capsule layer, wherein the input convolution layer is used for extracting low-level characteristics of an input remote sensing image; the convolution capsule layer is used for performing convolution filtering processing on the low-level features extracted by the input convolution layer and converting the low-level features into capsules;
the contraction path module comprises a plurality of main capsule layers and is used for performing down-sampling processing on the capsules obtained by the characteristic extraction module;
an extended path module comprising a plurality of main capsule layers and a plurality of deconvolution capsule layers, the main capsule layers and the deconvolution capsule layers being configured to interleave with each other for upsampling capsules from the contracted path module; the extended path module also comprises an output main capsule layer which is used for carrying out convolution processing on data obtained by the up-sampling processing in the extended path module and outputting the data to the classification module;
a jump connection layer, the extension path module clipping and replicating low-level features in the contraction path module through the jump connection layer for upsampling processing in the extension path module;
a classification module comprising a classification capsule layer comprising a plurality of capsules, an activation vector modulo length of each capsule of the plurality of capsules for calculating a probability of whether an instance of each class exists;
the model training unit is used for training the Capsules-Unet model by utilizing the training set data and the verification set data to obtain the trained Capsules-Unet model;
and the classification unit is used for classifying the remote sensing image data to be classified by utilizing the trained Capsules-Unet model.
6. The remote sensing image deep learning classification system according to claim 5, characterized in that: the model training unit also comprises a model verification subunit which is used for verifying the Capsules-Unet model by adopting the verification set data, wherein when the set error is smaller than a given threshold value or the maximum iteration number is met, the training iteration is stopped, and the training of the Capsules-Unet model is completed.
7. The remote sensing image deep learning classification system according to claim 5, characterized in that: in the model training unit and the classification unit, an improved local constraint dynamic routing algorithm is adopted in the Capsules-Unet model, so that when data of a sub-capsule is routed to data of a next layer of parent capsule, the same transformation matrix is adopted for the sub-capsule and the parent capsule of the same type.
8. The remote sensing image deep learning classification system according to claim 5 or 7, characterized in that: in the model training unit and the classification unit, an improved local constraint dynamic routing algorithm is adopted in the Capsules-Unet model, so that when the data of a child capsule is routed to the data of a next-layer parent capsule, the child capsule is routed to the parent capsule only in one defined local window.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010199056.XA CN111401455B (en) | 2020-03-20 | 2020-03-20 | Remote sensing image deep learning classification method and system based on Capsules-Unet model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010199056.XA CN111401455B (en) | 2020-03-20 | 2020-03-20 | Remote sensing image deep learning classification method and system based on Capsules-Unet model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111401455A CN111401455A (en) | 2020-07-10 |
CN111401455B true CN111401455B (en) | 2023-04-18 |
Family
ID=71429004
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010199056.XA Active CN111401455B (en) | 2020-03-20 | 2020-03-20 | Remote sensing image deep learning classification method and system based on Capsules-Unet model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401455B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112184687B (en) * | 2020-10-10 | 2023-09-26 | 南京信息工程大学 | Road crack detection method based on capsule feature pyramid and storage medium |
CN112163549B (en) * | 2020-10-14 | 2022-06-10 | 中南大学 | Remote sensing image scene classification method based on automatic machine learning |
CN112348118A (en) * | 2020-11-30 | 2021-02-09 | 华平信息技术股份有限公司 | Image classification method based on gradient maintenance, storage medium and electronic device |
CN112580484B (en) * | 2020-12-14 | 2024-03-29 | 中国农业大学 | Remote sensing image corn straw coverage recognition method and device based on deep learning |
CN112507039A (en) * | 2020-12-15 | 2021-03-16 | 苏州元启创人工智能科技有限公司 | Text understanding method based on external knowledge embedding |
CN112766340B (en) * | 2021-01-11 | 2024-06-04 | 中山大学 | Depth capsule network image classification method and system based on self-adaptive spatial mode |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108830243A (en) * | 2018-06-22 | 2018-11-16 | 西安电子科技大学 | Hyperspectral image classification method based on capsule network |
CN110321859A (en) * | 2019-07-09 | 2019-10-11 | 中国矿业大学 | A kind of optical remote sensing scene classification method based on the twin capsule network of depth |
CN110728224A (en) * | 2019-10-08 | 2020-01-24 | 西安电子科技大学 | Remote sensing image classification method based on attention mechanism depth Contourlet network |
- 2020-03-20: CN202010199056.XA filed; granted as CN111401455B (en), status Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108830243A (en) * | 2018-06-22 | 2018-11-16 | 西安电子科技大学 | Hyperspectral image classification method based on capsule network |
CN110321859A (en) * | 2019-07-09 | 2019-10-11 | 中国矿业大学 | A kind of optical remote sensing scene classification method based on the twin capsule network of depth |
CN110728224A (en) * | 2019-10-08 | 2020-01-24 | 西安电子科技大学 | Remote sensing image classification method based on attention mechanism depth Contourlet network |
Non-Patent Citations (2)
Title |
---|
Mercedes E. Paoletti et al., "Capsule Networks for Hyperspectral Image Classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 4, pp. 2145-2160, 2019.
Ruirui Li et al., "DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation," https://arxiv.org/pdf/1709.00201.pdf, 2017, pp. 1-8.
Also Published As
Publication number | Publication date |
---|---|
CN111401455A (en) | 2020-07-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111401455B (en) | Remote sensing image deep learning classification method and system based on Capsules-Unet model | |
CN111191736B (en) | Hyperspectral image classification method based on depth feature cross fusion | |
Ghaderizadeh et al. | Hyperspectral image classification using a hybrid 3D-2D convolutional neural networks | |
CN111489358B (en) | Three-dimensional point cloud semantic segmentation method based on deep learning | |
CN109886066B (en) | Rapid target detection method based on multi-scale and multi-layer feature fusion | |
CN109461157B (en) | Image semantic segmentation method based on multistage feature fusion and Gaussian conditional random field | |
CN108230329B (en) | Semantic segmentation method based on multi-scale convolution neural network | |
CN110428428B (en) | Image semantic segmentation method, electronic equipment and readable storage medium | |
CN107092870B (en) | A kind of high resolution image Semantic features extraction method | |
CN111612008B (en) | Image segmentation method based on convolution network | |
Schulz et al. | Learning Object-Class Segmentation with Convolutional Neural Networks. | |
CN110097145A (en) | One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature | |
CN113901900A (en) | Unsupervised change detection method and system for homologous or heterologous remote sensing image | |
CN115359372A (en) | Unmanned aerial vehicle video moving object detection method based on optical flow network | |
Xu et al. | Robust self-ensembling network for hyperspectral image classification | |
CN113850324B (en) | Multispectral target detection method based on Yolov4 | |
CN113191213B (en) | High-resolution remote sensing image newly-added building detection method | |
CN116645592B (en) | Crack detection method based on image processing and storage medium | |
CN113435253A (en) | Multi-source image combined urban area ground surface coverage classification method | |
CN115205590A (en) | Hyperspectral image classification method based on complementary integration Transformer network | |
CN116434045B (en) | Intelligent identification method for tobacco leaf baking stage | |
CN115238758A (en) | Multi-task three-dimensional target detection method based on point cloud feature enhancement | |
CN115482518A (en) | Extensible multitask visual perception method for traffic scene | |
CN115527056A (en) | Hyperspectral image classification method based on dual-hybrid convolution generation countermeasure network | |
CN116343058A (en) | Global collaborative fusion-based multispectral and panchromatic satellite image earth surface classification method |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |