CN114187527B

CN114187527B - Migration learning ship target segmentation method based on linear heating and snapshot integration

Info

Publication number: CN114187527B
Application number: CN202111427112.1A
Authority: CN
Inventors: 张修社; 韩春雷; 亓子龙; 任子豪; 孙晓龙
Original assignee: CETC 20 Research Institute
Current assignee: CETC 20 Research Institute
Priority date: 2021-11-28
Filing date: 2021-11-28
Publication date: 2022-12-27
Anticipated expiration: 2041-11-28
Also published as: CN114187527A

Abstract

The invention provides a migration (transfer) learning ship target segmentation method based on linear heating and snapshot integration. A data set in which ship target sizes are uniformly distributed is taken as the source domain data set, and a data set used to verify the segmentation effect is taken as the target domain data set. A global-attention-based coding and decoding network of the source domain is constructed and trained, and the model with the highest segmentation accuracy on the source domain test set is retained. A global-attention-based coding and decoding network of the target domain is then constructed and trained to obtain the final segmentation model, which is used to test the target domain test data set and produce the final segmentation result. The method achieves ship target segmentation with fewer labels on the target domain, which to a certain extent alleviates the large demand for target domain labels. By integrating the optimal model from each cycle of the cyclic cosine annealing, it avoids the unstable training, and the resulting negative migration, that occur when the amount of target domain data is small, and it enhances the robustness of the model.

Description

Migration learning ship target segmentation method based on linear heating and snapshot integration

Technical Field

The invention relates to the field of image segmentation, in particular to a transfer learning method which can be used for intermediate processing of SAR image interpretation.

Background

In recent years, with the development of synthetic aperture radar (SAR) systems, the focus of information acquisition has gradually shifted from land to sea, and small-sample ship target segmentation in SAR images has become an urgent problem. At the same time, given the excellent performance of deep learning in computer vision, speech signal processing, natural language processing and other fields, combining deep learning with SAR image ship target segmentation has become a hot topic in SAR image processing. Deep learning continuously mines the inherent attribute features of the training data by training and learning layer by layer, and thereby obtains an abstract representation of the data.

Chenyangtang et al., in the article "Remote sensing image sea surface ship detection research based on depth semantic segmentation", propose a segmentation method based on the ResNet architecture. The remote sensing image is first fed to a deep convolutional neural network, which segments the image coarsely. Then, through an improved fully connected conditional random field, a conditional random field built with Gaussian pairwise potentials and the mean-field approximation is implemented as a recurrent neural network that produces the output, thereby realizing end-to-end connection.

Wang, in the paper "Multiscale CNN method in image segmentation", proposes an SAR image ship detection and segmentation method based on a three-dimensional dilated (atrous) convolutional neural network. The method adds image wavelet features to construct a multiscale three-dimensional image block, which is used as the input of the three-dimensional dilated convolutional neural network, thereby improving the ability of the network to extract both global and local target features. The network adopts an end-to-end structure in which the network output is the final result, so the model is convenient to use and efficient.

However, these methods are limited by the small scale of SAR image data in complex large scenes, and the model often generalizes poorly: it performs well on source domain data but its performance declines on the target domain. Transfer learning is a means of addressing exactly this problem. A universal model may not exist, but a model with acceptable performance can be transformed and adapted to carry out the specific task of a particular scene. For this reason, transfer learning is widely applied to detection and recognition problems on heterogeneous images.

However, because the result of transfer learning depends closely on parameters such as the learning rate, direct transfer is not well suited to SAR image ship target segmentation.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a migration learning ship target segmentation method based on linear heating, cyclic cosine annealing and snapshot integration, which improves the segmentation effect while reducing the amount of labeled data required on the target domain.

The technical scheme adopted by the invention for solving the technical problem comprises the following steps:

(1) Taking a data set with uniform size distribution of ship targets as a source domain data set, and dividing the source domain data set into a source domain training data set and a source domain testing data set in proportion;

(2) Taking a data set used to verify the segmentation effect as a target domain data set, and dividing the target domain data set into a target domain training data set, a target domain verification data set and a target domain test data set in proportion;

(3) Constructing a global attention coding and decoding network of a source domain, training the global attention coding and decoding network by using a source domain training data set, keeping a model with the highest segmentation accuracy on a source domain test set after each iteration, and obtaining the model with the highest segmentation accuracy on the source domain test set after the maximum iteration times are reached; the method comprises the following specific steps:

(3a) Constructing a global attention-based coding and decoding network of the source domain, wherein the network comprises an input layer, a coding layer, a decoding layer and an output layer connected in cascade, and the coding layer and the decoding layer are connected by a global attention module;

(3b) Setting a training optimizer as SGD, and adopting cross entropy loss as a loss function;

(3c) Training the global attention-based coding and decoding network of the source domain with the source domain training data set by a mini-batch gradient descent algorithm, testing with the source domain test data set after each iteration of the training process until the maximum number of iterations is reached, and taking the model with the best test result on the source domain test data set as the optimal model M_S on the source domain;

(4) The global attention coding and decoding network of the target domain is constructed in the same way as the global attention coding and decoding network of the source domain, and the parameters of the target domain network model M_T are initialized with the parameters of M_S;

(5) Setting the learning rate in the early stage of training with a linear heating strategy, setting the learning rate in the later stage of training with a cyclic cosine annealing strategy, and integrating models with a snapshot integration strategy, wherein the model retained in each cycle of the cyclic cosine annealing strategy is the model with the highest segmentation accuracy on the target domain verification set in that cycle, the optimizer adopts SGD (stochastic gradient descent), the loss function adopts the cross entropy loss function, and the last three models retained by the snapshot integration strategy form the final segmentation model M_E;

(6) Using the final segmentation model M_E to test the target domain test data set and obtain the final segmentation result.

The source domain data set is divided into a source domain training data set and a source domain testing data set according to the proportion of 4;

the cross entropy loss function is expressed as:

$$L = -\frac{1}{X}\sum_{x=1}^{X}\sum_{c=1}^{C} y_{x,c}\,\log p_{x,c}$$

wherein X represents the number of samples, C represents the number of categories, $y_{x,c}$ represents the label of sample x ($y_{x,c}$ is 1 when sample x belongs to class c and 0 otherwise), and $p_{x,c}$ represents the probability that sample x is predicted as class c.
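As an illustration of the loss described above, the following is a minimal NumPy sketch of a multi-class cross entropy. It assumes the loss is averaged over the X samples (for segmentation, each pixel can be treated as one sample); the function name `cross_entropy`, the integer label encoding and the clipping constant `eps` are illustrative choices, not taken from the patent.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean cross entropy over X samples and C classes.

    probs:  array of shape (X, C), predicted class probabilities per sample
    labels: integer array of shape (X,), true class index of each sample
    eps:    hypothetical clipping constant that avoids log(0)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    num_samples, num_classes = probs.shape
    one_hot = np.eye(num_classes)[labels]       # 1 for the true class, 0 otherwise
    log_p = np.log(np.clip(probs, eps, 1.0))    # log of predicted probabilities
    return float(-(one_hot * log_p).sum() / num_samples)

# Toy example: two samples, two classes (background, ship)
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])
loss = cross_entropy(probs, labels)
```

Only the probability assigned to the true class of each sample contributes to the sum, so a confident correct prediction drives the loss toward zero.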

The optimal model M_S is obtained as follows: the model with the highest segmentation accuracy so far on the source domain test set is kept after each iteration until the maximum number of iterations E_S is reached; the model with the highest segmentation accuracy on the source domain test data set is then obtained and taken as the optimal model M_S on the source domain;

The specific steps of the step (5) are as follows:

(5a) Setting the number of iteration rounds E_l and the initial learning rate lr_l of the initial linear heating, the initial learning rate lr_m of the cyclic cosine annealing in the snapshot integration strategy, the total number of iteration rounds E_m of the cyclic cosine annealing period, and the number of cycles n used;

(5b) Fine-tuning training using k% of the target domain training data set: in the first E_l rounds of iteration, the learning rate is set using the linear heating strategy; in the following E_m rounds of iteration, the learning rate is set using the cyclic cosine annealing strategy; in the cyclic cosine annealing stage, after every E_m/n iterations, the model M_i (i = 0, 1, ..., n-1) with the highest segmentation accuracy on the target domain verification set within the current E_m/n iterations is saved;

(5c) Using the snapshot integration strategy, the three models M_i, where i = n-3, n-2, n-1, are selected as the final segmentation model M_E; during segmentation, the segmentation results of the three models are averaged to give the result of the final segmentation model M_E.

The values in step (5a) are as follows: the number of iteration rounds of the initial linear heating E_l is 10, the initial learning rate lr_l is 0.0001, the initial learning rate of the cyclic cosine annealing in the snapshot integration strategy lr_m is 0.01, the total number of iteration rounds of the cyclic cosine annealing period E_m is 50, and the number of cycles used n is 5.
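A minimal sketch of one possible reading of this learning rate schedule, using the values above as defaults. It assumes the rate rises linearly from zero toward lr_l over the E_l heating rounds and then follows n cosine cycles that each restart at lr_m; the function name `learning_rate`, the 0-based step counter and the `steps_per_round` parameter (iteration steps per round) are illustrative assumptions.

```python
import math

def learning_rate(step, steps_per_round, E_l=10, lr_l=1e-4, lr_m=1e-2, E_m=50, n=5):
    """Learning rate at global iteration `step` (0-based, assumed convention)."""
    warmup_steps = E_l * steps_per_round
    if step < warmup_steps:
        # Linear heating: rise from zero toward lr_l
        return lr_l * step / warmup_steps
    # Cyclic cosine annealing: t restarts at 0 when linear heating ends
    t = step - warmup_steps
    cycle_len = (E_m * steps_per_round) // n     # iteration steps per cosine cycle
    pos = t % cycle_len                          # position inside the current cycle
    return 0.5 * lr_m * (math.cos(math.pi * pos / cycle_len) + 1.0)

# One step per round for readability: 10 heating rounds, then 5 cycles of 10 rounds
schedule = [learning_rate(s, steps_per_round=1) for s in range(60)]
```

With these values the rate jumps back to lr_m at rounds 10, 20, 30, 40 and 50 of the schedule, which is what allows a separate snapshot to be taken near the low-rate end of each cycle.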

The invention has the beneficial effects that:

1) The heterogeneous SAR image ship target under a small number of labels can be segmented.

The invention provides a migration learning ship target segmentation method based on linear heating and snapshot integration, which is characterized in that a data set relatively rich in ship targets is used as source domain data, the data set is used for training, and the obtained training model can realize ship target segmentation under fewer labels on a target domain under the fine adjustment of a small amount of data of the target domain, so that the problem of large demand of the labels of the target domain is solved to a certain extent.

2) The use of multiple strategies enhances the robustness of the model.

In the invention, for the fine tuning part after the migration, the strategy of linear heating and cyclic cosine annealing is used for the learning rate of the part, and the snapshot integration strategy is used, so that the optimal model in each cycle of the cyclic cosine annealing is integrated, the condition that the training is unstable when the data volume of the target domain is less so as to cause negative migration is avoided, and the robustness of the model is enhanced.

Drawings

FIG. 1 is a flow chart of an implementation of the present invention;

FIG. 2 is a ship size distribution condition in an SAR image of a Qingdao area in the invention;

FIG. 3 is a graph of the effect of different migration data volumes when the Hong Kong area is used as the target domain in the present invention, where graphs (a), (b), (c), (d), (e) and (f) show the results of migration using 0%, 2%, 3%, 4%, 5% and 10% of the data, respectively.

FIG. 4 is a graph of the effect of different migration data volumes when the Shanghai area is used as the target domain, where graphs (a), (b), (c), (d), (e) and (f) show the results of migration using 0%, 2%, 3%, 4%, 5% and 10% of the data, respectively.

FIG. 5 is a graph of the effect of different migration data volumes when the Istanbul harbor area is used as the target domain in the present invention, where graphs (a), (b), (c), (d), (e) and (f) show the results of migration using 0%, 2%, 3%, 4%, 5% and 10% of the data, respectively.

Detailed Description

The invention is further illustrated by the following examples in conjunction with the drawings.

Referring to fig. 1, the implementation steps of the present invention include the following:

Step 1, taking a data set rich in ship targets as a source domain data set, and dividing the data set into a source domain training data set and a source domain testing data set according to the proportion of 4.

Step 2, taking a data set for verifying the segmentation effect as a target domain data set, and dividing the data set into a target domain training data set, a target domain verifying data set and a target domain testing data set according to the proportion of 4;

Step 3, constructing a global attention coding and decoding network of the source domain, and training it with the source domain training data set to obtain the optimal model on the source domain test set.

(3.1) constructing a global attention coding and decoding network of a source domain, wherein the global attention coding and decoding network comprises an input layer, an encoding layer, a decoding layer, an output layer and a global attention module;

(3.1 a) in an input layer, performing wavelet decomposition on an input image to obtain a plurality of images with the same size as the input image, and overlapping the input image and each image subjected to wavelet decomposition to construct a 3D input image block;

(3.1 b) in the coding layer, extracting features from the image blocks obtained by the input layer to obtain coding features of the input image;

(3.1 c) the global attention module fuses the high-dimensional semantic features and the low-dimensional position features to obtain features fusing semantic and position information;

(3.1 d) in a decoding layer, decoding the features passing through the attention module to obtain a feature map with continuously increased scale;

(3.1 e) obtaining a pixel-level segmentation result in the output layer through the convolution layer and the softmax layer;

(3.1 f) the network consists of a cascade of input, encoding, global attention, decoding and output layers.
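The input layer of (3.1 a) can be sketched as follows. The patent does not name the wavelet, so this sketch assumes a single-level undecimated (stationary) Haar-style transform, chosen because it yields sub-bands of the same size as the input image; the function names `haar_swt_level1` and `build_input_block` and the periodic boundary handling are illustrative assumptions.

```python
import numpy as np

def haar_swt_level1(img):
    """One level of an undecimated 2D Haar-style transform (periodic boundary).

    Returns four sub-bands (LL, LH, HL, HH), each with the same shape as img.
    """
    img = np.asarray(img, dtype=float)

    def split(a, axis):
        # Average and difference of each pixel with its neighbour along `axis`
        shifted = np.roll(a, -1, axis=axis)
        return (a + shifted) / 2.0, (a - shifted) / 2.0

    low, high = split(img, axis=1)   # filter along rows
    ll, lh = split(low, axis=0)      # then along columns
    hl, hh = split(high, axis=0)
    return ll, lh, hl, hh

def build_input_block(img):
    """Stack the input image with its wavelet sub-bands into a 3D block."""
    img = np.asarray(img, dtype=float)
    return np.stack([img, *haar_swt_level1(img)], axis=0)

img = np.arange(24, dtype=float).reshape(4, 6)   # stand-in for a SAR image patch
block = build_input_block(img)                   # shape (5, 4, 6)
```

A library such as PyWavelets could be used instead for other wavelet families; the hand-written transform here only keeps the example self-contained.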

(3.2) randomly initializing the network parameters and setting the training optimizer as SGD, wherein the loss function adopts the cross entropy loss function, expressed as:

$$L = -\frac{1}{X}\sum_{x=1}^{X}\sum_{c=1}^{C} y_{x,c}\,\log p_{x,c}$$

wherein X represents the number of samples, C represents the number of classes, $y_{x,c}$ represents the label of sample x ($y_{x,c}$ is 1 when sample x belongs to class c and 0 otherwise), and $p_{x,c}$ represents the probability that sample x is predicted as class c;

(3.3) training the network, testing by using the source domain test data set after each iteration in the training process, comparing with a historical result, and keeping a model with the best test result;

(3.4) repeating (3.3) until the maximum number of iteration rounds is reached, and terminating training to obtain the optimal model M_S on the source domain test set.
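The keep-the-best-model loop of (3.3) and (3.4) can be sketched as below. The callables `train_one_round` and `evaluate` are hypothetical placeholders for one round of SGD training and for measuring segmentation accuracy on the source domain test set; the toy dictionary model only makes the sketch runnable.

```python
import copy

def train_keep_best(model, train_one_round, evaluate, max_rounds):
    """Train for max_rounds rounds and return the best-scoring model copy."""
    best_score = float("-inf")
    best_model = None
    for _ in range(max_rounds):
        train_one_round(model)                    # one round of training (placeholder)
        score = evaluate(model)                   # accuracy on the test set (placeholder)
        if score > best_score:                    # compare with the historical best
            best_score = score
            best_model = copy.deepcopy(model)     # keep a frozen copy of the best model
    return best_model, best_score

# Toy stand-ins: the "model" counts training rounds, scores are fixed in advance
scores = iter([0.5, 0.8, 0.7])
model = {"rounds_trained": 0}

def train_one_round(m):
    m["rounds_trained"] += 1

best_model, best_score = train_keep_best(
    model, train_one_round, lambda m: next(scores), max_rounds=3
)
```

The deep copy matters: without it the retained "best" model would keep changing as training continues.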

Step 4, constructing a global attention coding and decoding network of the target domain, whose model M_T is initialized with the parameters of M_S.

Step 5, performing migration learning by using the linear heating and snapshot integration strategies, wherein the optimizer adopts SGD (stochastic gradient descent), the loss function adopts the cross entropy loss function, and the multiple models for snapshot integration are obtained through the cyclic cosine annealing strategy, to obtain the final segmentation model M_E.

(5.1) setting the number of iteration rounds E_l and the initial learning rate lr_l of the initial linear heating, the initial learning rate lr_m of the cyclic cosine annealing, the total number of iteration rounds E_m of the cyclic cosine annealing period, and the number of cycles n used;

(5.2) setting the learning rate to change according to the settings of linear heating and cyclic cosine annealing;

(5.2 a) in the early stage of training, the learning rate changes in a linear heating mode, slowly increasing from zero to lr_l; the mathematical expression of the learning rate is:

$$lr(s) = lr_l \cdot \frac{s}{E_l \cdot S}$$

wherein lr_l is the maximum learning rate of the linear heating stage, s is the current iteration step number, E_l is the number of iteration rounds of linear heating, and S is the total number of iteration steps in each round.

(5.2 b) in the later stage of training, the learning rate follows a cyclic cosine decay function: within each cycle it decays from the maximum to the minimum value, and at the start of the next cycle it jumps back to the initially set maximum learning rate; the mathematical expression of the learning rate is:

$$lr(t) = \frac{lr_m}{2}\left(\cos\left(\frac{\pi \cdot \mathrm{mod}(t,\ E_m/n)}{E_m/n}\right) + 1\right)$$

wherein lr_m is the initially set maximum learning rate, t is the current iteration number (t is 0 when the linear heating phase ends), E_m is the total number of iterations during the cosine decay periods, and n is the number of periods.

(5.3) performing fine-tuning training by using k% of the target domain training data set, wherein the learning rate changes according to the linear heating and cyclic cosine annealing settings; after the linear heating stage, every E_m/n iterations, saving the model M_i (i = 0, 1, ..., n-1) that performs best on the verification set within the current E_m/n iterations;

(5.4) using the snapshot integration strategy, selecting the three models M_{n-3}, M_{n-2} and M_{n-1} for integration to obtain the final segmentation model M_E, wherein the snapshot-integrated segmentation model M_E can be represented as:

$$M_E(x) = \frac{1}{3}\sum_{i=n-3}^{n-1} M_i(x)$$

wherein M_E represents the final integrated model and x represents the input image to be segmented.
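The snapshot selection and integration of Step 5 can be sketched as follows. It assumes each snapshot is a callable that returns per-class probability maps of shape (C, H, W) and that "averaging the segmentation results" means averaging these maps before taking the per-pixel arg-max; the names `best_per_cycle` and `ensemble_predict` and the toy snapshots are illustrative, not from the patent.

```python
import numpy as np

def best_per_cycle(val_scores, cycle_len):
    """Index of the best-scoring round inside each cosine annealing cycle."""
    best = []
    for start in range(0, len(val_scores), cycle_len):
        chunk = val_scores[start:start + cycle_len]
        best.append(start + int(np.argmax(chunk)))
    return best

def ensemble_predict(snapshots, x, keep=3):
    """Average the probability maps of the last `keep` snapshots, then arg-max."""
    probs = np.mean([m(x) for m in snapshots[-keep:]], axis=0)   # (C, H, W)
    return probs.argmax(axis=0)                                  # (H, W) label map

# Toy snapshots: each ignores x and returns fixed maps for C=2 classes, 1x2 pixels
def make_snapshot(class0):
    p0 = np.array([class0], dtype=float)          # class-0 probabilities, shape (1, 2)
    return lambda x: np.stack([p0, 1.0 - p0])     # shape (2, 1, 2)

snapshots = [make_snapshot(v) for v in ([0.0, 0.0], [0.9, 0.2], [0.4, 0.4], [0.6, 0.3])]
pred = ensemble_predict(snapshots, x=None)        # the first snapshot is not used
picked = best_per_cycle([0.1, 0.5, 0.3, 0.7, 0.6, 0.2], cycle_len=2)
```

Averaging probabilities rather than hard labels lets a confident snapshot outweigh two uncertain ones, which is one common way to realize such an integration.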

Step 6, using the model M_E to test the target domain test data set and obtain the final segmentation result.

(6.1) sequentially taking a sample from the target domain test set and inputting the sample into the trained integrated model to obtain a segmentation result pred corresponding to the sample;

and (6.2) repeating (6.1) until all the query images of the target domain test set obtain the segmentation result, and ending the test.

The effects of the present invention can be further illustrated by the following simulations.

Simulation data

The data sets used comprise a Qingdao area data set, a Hong Kong area data set, a Shanghai area data set and an Istanbul harbor data set. The ship target sizes in the Qingdao area are distributed relatively evenly, with few very large or very small ship targets; the size distribution of ship targets in the Qingdao area is shown in FIG. 2. Such a size distribution greatly helps the training of the model, so the Qingdao area data set is selected as the source domain data set, and the data sets of the other three areas are each selected as target domain data sets.

Emulation content

Three segmentation methods, namely the global attention coding and decoding network (GAM-EDNet), the existing PSPNet, and a DCNN-based method, are adopted as comparison methods and compared with the global attention coding and decoding network under transfer learning in the invention; the training data are the same in all experiments. Verification experiments are carried out on the Hong Kong area, Shanghai area and Istanbul harbor area data sets, verifying the effect of the method when the migration data volume is 0%, 2%, 3%, 4%, 5% and 10%, respectively. The intersection over union (IoU), the weighted intersection over union and the Kappa coefficient are used as evaluation indexes. The segmentation effect diagrams are shown in FIGS. 3 to 5: FIG. 3 shows the effect of different migration data volumes when the Hong Kong area is used as the target domain, FIG. 4 when the Shanghai area is used as the target domain, and FIG. 5 when the Istanbul harbor area is used as the target domain. Tables 1 and 2 are, respectively, the comparison of the results of different algorithms and the results of the method at different migration data volumes when the target domain is the Hong Kong area.

TABLE 1 Comparison of different algorithm results in the Hong Kong area

TABLE 2 Results of the method in the Hong Kong area under different migration data volumes

Tables 3 and 4 are the comparison of the results of different algorithms and the results of the method at different migration data volumes when the target domain is the Shanghai region, respectively, wherein 10% of the target domain data is used for training in table 3.

TABLE 3 comparison of different algorithm results in Shanghai region

TABLE 4 Results of the method in the Shanghai area under different migration data volumes

Tables 5 and 6 are, respectively, the comparison of the results of different algorithms and the results of the method at different migration data volumes when the target domain is the Istanbul harbor area, wherein 10% of the target domain data is used for training in Table 5.

TABLE 5 Comparison of different algorithm results in the Istanbul harbor area

TABLE 6 Results of the method in the Istanbul harbor area under different migration data volumes
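The three evaluation indexes used in the simulations can be computed from a pixel-level confusion matrix, as in the sketch below. It assumes the intersection over union is averaged over classes and that the weighted variant weights each class by its share of ground-truth pixels; the function names and the toy labels are illustrative, and the sketch does not guard against classes that never occur.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    idx = num_classes * np.asarray(y_true).ravel() + np.asarray(y_pred).ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def segmentation_metrics(cm):
    """Return (mean IoU, frequency-weighted IoU, Cohen's Kappa)."""
    cm = cm.astype(float)
    total = cm.sum()
    tp = np.diag(cm)                      # correctly labeled pixels per class
    true_count = cm.sum(axis=1)           # ground-truth pixels per class
    pred_count = cm.sum(axis=0)           # predicted pixels per class
    iou = tp / (true_count + pred_count - tp)
    mean_iou = iou.mean()
    fw_iou = (true_count / total * iou).sum()
    p_observed = tp.sum() / total                           # overall agreement
    p_expected = (true_count * pred_count).sum() / total ** 2   # chance agreement
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return mean_iou, fw_iou, kappa

# Toy example: four pixels, classes 0 (background) and 1 (ship)
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
cm = confusion_matrix(y_true, y_pred, num_classes=2)
mean_iou, fw_iou, kappa = segmentation_metrics(cm)
```

Kappa corrects the overall pixel accuracy for chance agreement, which is useful when background pixels greatly outnumber ship pixels.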

Simulation effect analysis

From tables 1, 3 and 5, it can be seen that compared with other methods, the method can achieve the optimal effect under the condition of using 10% of training data of the target domain, thereby proving that the method achieves better effect under the condition of less training data of the target domain.

As can be seen from tables 2, 4 and 6, in the three regions, as the migration amount of the target domain data increases, the segmentation performance becomes better, and as can be seen from fig. 4, as the migration amount of the target domain data increases, the region consistency of the segmentation result becomes better.

From the simulation results, the method can achieve a good effect under the condition that the target domain training data are less, and fully explains the effectiveness of the migration learning and the linear heating and snapshot integration strategy used in the method.

Claims

1. A migration learning ship target segmentation method based on linear heating and snapshot integration is characterized by comprising the following steps:

(1) Taking a data set with uniformly distributed sizes of ship targets as a source domain data set, and dividing the source domain data set into a source domain training data set and a source domain testing data set in proportion;

(3b) Setting a training optimizer as SGD, wherein a loss function adopts cross entropy loss;

(3c) Training the global attention-based coding and decoding network of the source domain with the source domain training data set by a mini-batch gradient descent algorithm, testing with the source domain test data set after each iteration of the training process until the maximum number of iterations is reached, and taking the model with the best test result on the source domain test data set as the optimal model M_S on the source domain;

(4) The global attention coding and decoding network of the target domain is constructed in the same way as the global attention coding and decoding network of the source domain, and the parameters of the target domain network model M_T are initialized with the parameters of M_S;

2. The migration learning ship target segmentation method based on linear heating and snapshot integration according to claim 1, characterized in that:

the source domain data set is divided into a source domain training data set and a source domain testing data set according to the proportion of 4.

3. The migration learning ship target segmentation method based on linear heating and snapshot integration according to claim 1, characterized in that:

the cross entropy loss function is expressed as:

$$L = -\frac{1}{X}\sum_{x=1}^{X}\sum_{c=1}^{C} y_{x,c}\,\log p_{x,c}$$

wherein X represents the number of samples, C represents the number of classes, $y_{x,c}$ represents the label of sample x ($y_{x,c}$ is 1 when sample x belongs to class c and 0 otherwise), and $p_{x,c}$ represents the probability that sample x is predicted as class c.

4. The migration learning ship target segmentation method based on linear heating and snapshot integration according to claim 1, characterized in that:

the optimal model M_S is obtained as follows: the model with the highest segmentation accuracy so far on the source domain test set is kept after each iteration until the maximum number of iterations E_S is reached; the model with the highest segmentation accuracy on the source domain test data set is then obtained and taken as the optimal model M_S on the source domain.

5. The migration learning ship target segmentation method based on linear heating and snapshot integration according to claim 1, characterized in that:

the specific steps of the step (5) are as follows:

(5a) Setting the number of iteration rounds E_l and the initial learning rate lr_l of the initial linear heating, the initial learning rate lr_m of the cyclic cosine annealing in the snapshot integration strategy, the total number of iteration rounds E_m of the cyclic cosine annealing period, and the number of cycles n used;

(5b) Fine-tuning training using k% of the target domain training data set: in the first E_l rounds of iteration, the learning rate is set using the linear heating strategy; in the following E_m rounds of iteration, the learning rate is set using the cyclic cosine annealing strategy; in the cyclic cosine annealing stage, after every E_m/n iterations, the model M_i (i = 0, 1, ..., n-1) with the highest segmentation accuracy on the target domain verification set within the current E_m/n iterations is saved;

(5c) Using the snapshot integration strategy, the three models M_i, where i = n-3, n-2, n-1, are selected as the final segmentation model M_E; during segmentation, the segmentation results of the three models are averaged to give the result of the final segmentation model M_E.

6. The migration learning ship target segmentation method based on linear heating and snapshot integration as claimed in claim 5, wherein:

the values in step (5a) are as follows: the number of iteration rounds of the initial linear heating E_l is 10, the initial learning rate lr_l is 0.0001, the initial learning rate of the cyclic cosine annealing in the snapshot integration strategy lr_m is 0.01, the total number of iteration rounds of the cyclic cosine annealing period E_m is 50, and the number of cycles used n is 5.