CN108596108B - Aerial remote sensing image change detection method based on triple semantic relation learning

Aerial remote sensing image change detection method based on triple semantic relation learning

Info

Publication number
CN108596108B
Authority
CN
China
Prior art keywords
data set
remote sensing
feature
image
triple
Legal status
Active
Application number
CN201810385526.4A
Other languages
Chinese (zh)
Other versions
CN108596108A (en)
Inventor
Chen Keming
Zhang Mengya
Xu Guangluan
Yan Menglong
Current Assignee
Institute of Electronics of CAS
Original Assignee
Institute of Electronics of CAS
Application filed by Institute of Electronics of CAS
Priority to CN201810385526.4A
Publication of CN108596108A
Application granted
Publication of CN108596108B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an aerial remote sensing image change detection method based on triple semantic relation learning, which comprises the following steps. Step A: construct a two-way deep neural network model based on triple semantic relation learning. Step B: train the two-way deep neural network model with a training data set. Step C: obtain a feature representation of the test data set based on the test data set and the trained two-way deep neural network model. Step D: calculate the Euclidean distance between the two time-phase images based on the feature representation of the test data set to obtain a difference image. Step E: process the difference image with a threshold method to obtain the change detection result. The method selects the features of multi-temporal aerial remote sensing images automatically by deep learning, expresses the images more comprehensively and deeply, requires no manual feature selection, saves time and labor, and is convenient for engineering application.

Description

Aerial remote sensing image change detection method based on triple semantic relation learning
Technical Field
The disclosure relates to the technical field of remote sensing image processing, and in particular to an aerial remote sensing image change detection method based on triple semantic relation learning.
Background
Human activities strongly influence the Earth's surface, and this influence is reflected in many aspects such as environmental change and urban development. Accurately and promptly grasping changes in land cover is therefore significant for environmental monitoring and resource management. Change detection determines changes on the Earth's surface by observing the distribution of ground objects in the same region at different times. Remote sensing images can provide surface information over large areas and long periods, and thus have important applications in change detection. In recent years, with the development of aerial remote sensing technology, the volume of aerial image data has grown enormously, making aerial remote sensing image change detection an important topic in the remote sensing field.
Aerial remote sensing image change detection methods fall mainly into two categories. In the first, the two time-phase remote sensing images are classified separately and the resulting classification maps are compared and analyzed to obtain the change detection result. In the second, the multi-temporal images are compared to generate a difference image, which is then analyzed to obtain the change detection result. The latter is the mainstream approach, and how to generate a high-quality difference image is an important research direction in change detection.
However, in implementing the present disclosure, the inventors found that the common way to generate a difference image is to compare features extracted from the different time phases. Traditional change detection methods extract features manually, and the expressive power of such features is limited. Change detection methods combined with deep learning extract features through a deep neural network, which are more robust and more abstract; however, the features extracted by existing deep learning methods ignore the semantic relation between pixels and the multi-scale nature of change areas, and therefore cannot generate a high-quality difference image.
BRIEF SUMMARY OF THE PRESENT DISCLOSURE
(I) Technical problem to be solved
In view of the above technical problems, the invention provides an aerial remote sensing image change detection method based on triple semantic relation learning, so as to solve the technical problems that prior-art change detection methods ignore the semantic relation between pixels and the multi-scale nature of change areas, and cannot generate a high-quality difference image.
(II) Technical scheme
The invention provides an aerial remote sensing image change detection method based on triple semantic relation learning, which comprises the following steps. Step A: construct a two-way deep neural network model based on triple semantic relation learning. Step B: train the two-way deep neural network model with a training data set. Step C: obtain a feature representation of the test data set based on the test data set and the trained two-way deep neural network model. Step D: calculate the Euclidean distance between the two time-phase images based on the feature representation of the test data set to obtain a difference image. Step E: process the difference image with a threshold method to obtain the change detection result.
In some embodiments of the present disclosure, the step A comprises: Step A1: constructing the two-way deep neural network model for feature extraction based on a 101-layer residual network; Step A2: acquiring a triplet selection layer for training; and Step A3: setting a loss function (Triplet loss) layer.
In some embodiments of the present disclosure, the step A1 includes: Step A1a: replacing the fully connected layer in the 101-layer residual network with a fully convolutional layer; Step A1b: enlarging the receptive field with atrous (dilated) convolution; and Step A1c: extracting features of different scales with atrous spatial pyramid pooling.
In some embodiments of the present disclosure, the step A2 includes: Step A2a: concatenating, through a cascade layer, the feature representations of the two time-phase training data in the training data set into one feature map, wherein the feature map satisfies the following formula:

f_w(X) = {f_w(x_ij) | 1 ≤ i ≤ H, 1 ≤ j ≤ W}

wherein f_w(x_ij) denotes the feature vector of the pixel at position (i, j) of the feature map, and H and W denote the height and width of the current feature map.
The step A2 further includes: Step A2b: obtaining the feature vector f_w^a(x_ij) of each pixel on the feature map and marking it as the anchor; Step A2c: obtaining the feature vector f_w^p(x_ij) that has the same label as the anchor and is farthest from it, and marking it as the positive; Step A2d: obtaining the feature vector f_w^n(x_ij) that has a different label from the anchor and is closest to it, and marking it as the negative; and Step A2e: combining the anchor, positive and negative feature vectors into the triplet {f_w^a(x_ij), f_w^p(x_ij), f_w^n(x_ij)}.
In some embodiments of the present disclosure, in step a 3: the Triplet loss layer satisfies the following forward calculation formula:
Figure GDA0002501958140000031
wherein L (w) represents a loss function, LpRepresenting functions that only consider inter-class losses, LtRepresenting a conventional triplet loss function, P representing the number of pairs of images input to the network, w representing a network parameter, λ representing a weight to measure the two losses, m1Is a constant, m2Is a ratio m1A small constant.
The step A3 further includes calculating the partial derivative of the Triplet loss layer according to the following formula:

∂L(w)/∂w = (1/P) Σ_{i,j} [ h_2(w) + λ · h_1(w) ]

wherein h_1(w) denotes the partial derivative of L_p with respect to the parameters and h_2(w) denotes the partial derivative of L_t with respect to the parameters:

h_1(w) = 2 (f_w^a(x_ij) - f_w^p(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^p(x_ij)/∂w), if ||f_w^a(x_ij) - f_w^p(x_ij)||² - m_2 > 0;

h_1(w) = 0, otherwise;

h_2(w) = 2 (f_w^a(x_ij) - f_w^p(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^p(x_ij)/∂w) - 2 (f_w^a(x_ij) - f_w^n(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^n(x_ij)/∂w), if m_1 + ||f_w^a(x_ij) - f_w^p(x_ij)||² - ||f_w^a(x_ij) - f_w^n(x_ij)||² > 0;

h_2(w) = 0, otherwise.
In some embodiments of the present disclosure, in the step B, the two-way deep neural network is trained on the training data set by stochastic gradient descent.
In some embodiments of the present disclosure, the step C comprises: using the test data set as the input of the trained two-way deep neural network model obtained in step B, removing the cascade layer, the triplet selection layer and the loss function layer at the end of the model, and keeping the output of the multi-scale feature fusion layer as the learned depth feature representation of the test data set.
In some embodiments of the present disclosure, the step D comprises: for the feature representations of the test images acquired in step C, restoring the resolution of the feature maps to the input image size by bilinear interpolation with a factor of 8, and calculating the Euclidean distance between the two feature maps to obtain a difference image.
In some embodiments of the present disclosure, the step E comprises processing the difference image obtained in step D according to the following strategy: when d(x_mn) > th, the pixel is set to 255; when d(x_mn) ≤ th, the pixel is set to 0, wherein d(x_mn) denotes the distance value of the pixel at coordinates (m, n) of the difference image, and th denotes a constant threshold.
In some embodiments of the present disclosure, the multi-temporal remote sensing image data to be detected are preprocessed to obtain a training data set and a test data set. The preprocessing of the multi-temporal remote sensing image data to be detected comprises: performing relative radiometric correction on multiple groups of images of the same region at different times by histogram matching, to eliminate radiometric differences between the images of different time phases; and/or cropping or selecting the registered images to obtain the training data set and the test data set of the remote sensing images, wherein cropping the registered images comprises: cropping an arbitrary region of each group of two time-phase images as a test region and using the remaining region of the two time-phase images as a training region; and/or, after the two time-phase images of the training region are cropped, applying horizontal and vertical flips and rotations to obtain an expanded training data set.
(III) Advantageous effects
According to the above technical scheme, the aerial remote sensing image change detection method based on triple semantic relation learning has at least one of the following beneficial effects:
(1) the features of multi-temporal aerial remote sensing images are selected automatically by a deep learning method, so the images can be expressed more comprehensively and deeply without manual feature selection, saving time and labor and facilitating engineering application;
(2) the two-way network can process two images simultaneously, and the two branches of the network share weights, which is equivalent to extracting features from the two images by the same method;
(3) atrous convolution alleviates the reduced feature-map resolution caused by downsampling and pooling, lowers the factor by which the picture size shrinks, and effectively enlarges the receptive field without increasing the number of parameters or the amount of computation, yielding a denser feature map;
(4) atrous spatial pyramid pooling enables the extraction of feature representations of change areas at different scales;
(5) with the improved triplet loss function, learning the semantic relation between pixels makes the distance between feature vectors with the same label smaller than that between feature vectors with different labels, enhancing the compactness of same-label pixels;
(6) the change detection result image is obtained by a threshold method; because the semantic relation between pixels has been learned, same-label pixels are closer in the result image and noise points are fewer.
Drawings
Fig. 1 is a schematic flow chart of an aerial remote sensing image change detection method based on triple semantic relation learning according to an embodiment of the present disclosure.
Fig. 2 is a multi-temporal remote sensing image dataset to be detected selected by the embodiment of the present disclosure.
Fig. 3 is a graph of test results for an embodiment of the disclosure.
Detailed Description
In the aerial remote sensing image change detection method based on triple semantic relation learning provided by the present disclosure, a deep learning method is used to learn the semantic relation between image pixels and thereby realize remote sensing image change detection. No manual feature extraction is needed, saving time and labor; image features can be extracted more comprehensively and deeply to obtain the difference image, which is then analyzed to obtain the change result image, achieving excellent results in the field of remote sensing image change detection.
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
Fig. 1 is a schematic flow chart of an aerial remote sensing image change detection method based on triple semantic relation learning according to an embodiment of the present disclosure.
The embodiment of the disclosure provides an aerial remote sensing image change detection method based on triple semantic relation learning, as shown in fig. 1, including:
Step A: constructing a two-way deep neural network model based on triple semantic relation learning;
Step B: training the two-way deep neural network model with a training data set;
Step C: obtaining a feature representation of the test data set based on the test data set and the trained two-way deep neural network model;
Step D: calculating the Euclidean distance between the two time-phase images based on the feature representation of the test data set to obtain a difference image; and
Step E: processing the difference image with a threshold method to obtain the change detection result. The features of multi-temporal aerial remote sensing images are thus selected automatically by deep learning, so the images can be expressed more comprehensively and deeply without manual feature selection, saving time and labor and facilitating engineering application.
In some embodiments of the present disclosure, step A comprises:
Step A1: constructing the two-way deep neural network model for feature extraction based on a 101-layer residual network;
Step A2: acquiring a triplet selection layer for training; and
Step A3: setting a loss function (Triplet loss) layer.
In the step A1, a training data set is used as input to construct the two-way deep neural network model for feature extraction. The network structure is based on a 101-layer residual network; the two-way network can process two images simultaneously, and the two branches of the network share weights, which is equivalent to extracting features from the two images by the same method.
In practical application, the 101-layer residual network can be divided into five convolution blocks. The first block consists of a convolution layer and a pooling layer, where the convolution kernel is 7 × 7 and the number of filters is 64; the second block consists of 3 groups of [(1 × 1, 64) (3 × 3, 64) (1 × 1, 256)] convolution filters, where the first term (1 × 1, 3 × 3, 1 × 1) is the kernel size and the second term (64, 256) is the number of filters; the third block consists of 4 groups of [(1 × 1, 128) (3 × 3, 128) (1 × 1, 512)] convolution filters; the fourth block consists of 23 groups of [(1 × 1, 256) (3 × 3, 256) (1 × 1, 1024)] convolution filters; the fifth block consists of 3 groups of [(1 × 1, 512) (3 × 3, 512) (1 × 1, 2048)] convolution filters; and the last layer is a fully connected layer. To make the network structure more suitable for change detection, the residual network is improved as follows:
Step A1a: the fully connected layer in the 101-layer residual network is replaced by a fully convolutional layer, namely a convolution layer with a 1 × 1 kernel and 16 filters.
Step A1b: atrous (dilated) convolution is adopted to address the reduced feature-map resolution caused by downsampling and pooling. In the embodiment of the disclosure, with atrous convolution the picture size is reduced by only a factor of 8 instead of the original factor of 32, and the receptive field is effectively enlarged without increasing the number of parameters or the amount of computation, yielding a denser feature map. Both the fourth and fifth convolution blocks use atrous convolution layers, with a dilation rate of 2 in the fourth block and 4 in the fifth block.
Step A1c: atrous spatial pyramid pooling is adopted to address the multi-scale target problem; since different dilation rates correspond to different receptive field sizes, features of different scales can be extracted with different rates. In the embodiment of the disclosure, an atrous spatial pyramid structure is appended after the fifth convolution block: several parallel atrous convolutions, all with 3 × 3 kernels and dilation rates of 6, 12, 18 and 24 respectively, capture objects and context information at multiple scales, and the multi-scale features are finally fused to obtain the feature representation of the image.
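To make steps A1a-A1c concrete, the following minimal PyTorch sketch assembles the modified backbone in one way; the class name, the use of torchvision's resnet101, and fusing the ASPP branches by summation are illustrative assumptions rather than details fixed by the text:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet101

class FeatureExtractor(nn.Module):
    """One-branch feature extractor of the two-way network (steps A1a-A1c):
    ResNet-101 with output stride 8, atrous convolution in blocks 4-5, and
    an atrous spatial pyramid with rates 6/12/18/24 whose 16-filter branches
    stand in for the removed fully connected layer."""

    def __init__(self):
        super().__init__()
        # Step A1b: dilate conv blocks 4 and 5 (rates 2 and 4) so the feature
        # map shrinks only 8x; weights start from a pre-trained model.
        net = resnet101(weights="IMAGENET1K_V1",
                        replace_stride_with_dilation=[False, True, True])
        self.backbone = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool,
                                      net.layer1, net.layer2, net.layer3, net.layer4)
        # Step A1c: parallel 3x3 atrous convolutions at rates 6, 12, 18, 24.
        self.aspp = nn.ModuleList(
            nn.Conv2d(2048, 16, kernel_size=3, padding=r, dilation=r)
            for r in (6, 12, 18, 24))

    def forward(self, x):
        x = self.backbone(x)
        # Multi-scale feature fusion; summation of branches is assumed here.
        return torch.stack([branch(x) for branch in self.aspp]).sum(dim=0)
```

Both branches of the two-way network can simply call the same FeatureExtractor instance, which realizes the weight sharing described above.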
As described above, in step a2, the feature representations of the training data at two time phases in the training data set are cascaded by the cascade layer into a feature map satisfying the following equation:
fw(X)={fw(xij)|1≤i≤H,1≤j≤W}
wherein f isw(xij) Representing the feature vector of the corresponding pixel on the feature icon (i, j), and H, W representing the height and width of the current feature map;
In some embodiments of the present disclosure, the feature vector f_w^a(x_ij) of each pixel on the feature map is obtained and marked as the anchor; the feature vector f_w^p(x_ij) that has the same label as the anchor and is farthest from it is obtained and marked as the positive; the feature vector f_w^n(x_ij) that has a different label from the anchor and is closest to it is then obtained and marked as the negative; finally, the anchor, positive and negative feature vectors are combined into the triplet {f_w^a(x_ij), f_w^p(x_ij), f_w^n(x_ij)}. A triplet is thus obtained for every pixel on the feature map; all the positive feature vectors form a corresponding positive feature map, and all the negative feature vectors form a corresponding negative feature map.
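The triplet selection layer of step A2 can be sketched as follows, assuming binary per-pixel change labels already downsampled to the feature-map resolution; the brute-force distance matrix is for clarity only and is practical only for small feature maps:

```python
import torch

def select_triplets(feat, labels):
    """Triplet selection (step A2) over a cascaded two-phase feature map.

    feat:   (C, H, W) output of the cascade layer
    labels: (H, W) binary change labels (1 = changed, 0 = unchanged)
    Returns anchor, positive, negative tensors, each of shape (H*W, C).
    """
    C, H, W = feat.shape
    anchor = feat.reshape(C, H * W).t()      # feature vector of every pixel
    lab = labels.reshape(H * W)
    dist = torch.cdist(anchor, anchor)       # pairwise Euclidean distances
    same = lab[:, None] == lab[None, :]
    # Step A2c: farthest feature vector with the same label -> positive.
    positive = anchor[dist.masked_fill(~same, float("-inf")).argmax(dim=1)]
    # Step A2d: closest feature vector with a different label -> negative.
    # If the sample is entirely changed or unchanged, no valid negative
    # exists (the case the improved loss below is designed to handle).
    negative = anchor[dist.masked_fill(same, float("inf")).argmin(dim=1)]
    return anchor, positive, negative
```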
As described above, in step A3, the anchor feature map, the positive feature map, and the negative feature map obtained in step A2 are used to compute the loss function Triplet loss. The similarity between feature vectors is measured by the Euclidean distance. The conventional triplet loss only requires that, within a triplet, the distance between feature vectors with different labels, ||f_w^a(x_ij) - f_w^n(x_ij)||², exceed the distance between feature vectors with the same label, ||f_w^a(x_ij) - f_w^p(x_ij)||², by a specified value. In change detection, however, a training sample may be entirely changed or entirely unchanged, in which case the negative feature map does not exist; in addition, the conventional loss function places no constraint on ||f_w^a(x_ij) - f_w^p(x_ij)||² itself, so this distance may remain large, which does not meet the requirements of the embodiments of the present disclosure. The conventional triplet loss function is therefore not suitable here. The loss function provided in the embodiment of the disclosure adds to the conventional loss a constraint on ||f_w^a(x_ij) - f_w^p(x_ij)||² that keeps this distance within a specified range, so that same-label feature vectors become closer in the feature space and different-label feature vectors become farther apart.
In some embodiments of the present disclosure, the Triplet loss layer satisfies the following forward calculation formula:

L(w) = (1/P) Σ_{i,j} [ max(0, m_1 + ||f_w^a(x_ij) - f_w^p(x_ij)||² - ||f_w^a(x_ij) - f_w^n(x_ij)||²) + λ · max(0, ||f_w^a(x_ij) - f_w^p(x_ij)||² - m_2) ]

wherein L(w) denotes the loss function; the first max term is L_t, the conventional triplet loss function; the second max term is L_p, which considers only the intra-class (same-label) distance; P denotes the number of image pairs input to the network; w denotes the network parameters; λ denotes a weight balancing the two losses; m_1 is a constant; and m_2 is a constant smaller than m_1. In some embodiments of the disclosure, λ is 0.5, m_1 is 0.5, and m_2 is 0.3.
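A minimal sketch of this forward computation follows. Because the published formula is reproduced only as an image, the composition implemented here, L = L_t + λ·L_p with hinge margins m_1 and m_2, is an assumption consistent with the surrounding description and the stated values λ = 0.5, m_1 = 0.5, m_2 = 0.3:

```python
import torch

def improved_triplet_loss(anchor, positive, negative,
                          lam=0.5, m1=0.5, m2=0.3, has_negative=True):
    """Improved triplet loss: conventional hinge L_t plus an extra hinge
    L_p that keeps the same-label (anchor-positive) distance under m2."""
    d_ap = (anchor - positive).pow(2).sum(dim=1)   # squared distances
    l_p = torch.clamp(d_ap - m2, min=0)
    if has_negative:
        d_an = (anchor - negative).pow(2).sum(dim=1)
        l_t = torch.clamp(m1 + d_ap - d_an, min=0)
    else:
        # Sample entirely changed or unchanged: no negative feature map
        # exists, so only the same-label constraint contributes.
        l_t = torch.zeros_like(l_p)
    return (l_t + lam * l_p).mean()
```

Under automatic differentiation, the piecewise derivatives h_1(w) and h_2(w) given in the next formula fall out of the clamp (hinge) terms without being coded explicitly.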
In some embodiments of the present disclosure, step A3 includes calculating the partial derivative of the Triplet loss layer according to the following formula:

∂L(w)/∂w = (1/P) Σ_{i,j} [ h_2(w) + λ · h_1(w) ]

wherein h_1(w) denotes the partial derivative of L_p with respect to the parameters and h_2(w) denotes the partial derivative of L_t with respect to the parameters:

h_1(w) = 2 (f_w^a(x_ij) - f_w^p(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^p(x_ij)/∂w), if ||f_w^a(x_ij) - f_w^p(x_ij)||² - m_2 > 0;

h_1(w) = 0, otherwise;

h_2(w) = 2 (f_w^a(x_ij) - f_w^p(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^p(x_ij)/∂w) - 2 (f_w^a(x_ij) - f_w^n(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^n(x_ij)/∂w), if m_1 + ||f_w^a(x_ij) - f_w^p(x_ij)||² - ||f_w^a(x_ij) - f_w^n(x_ij)||² > 0;

h_2(w) = 0, otherwise.
In some embodiments of the present disclosure, in step B, the two-way deep neural network is trained on the training data set by stochastic gradient descent. The initialization and gradient updates of the two branches of the network are identical; training is complete when the loss function of the whole deep neural network approaches a local optimum. Because the network has many layers and is difficult to train to the optimal state from scratch, it is initialized with a pre-trained model.
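A minimal training loop for step B might look as follows, reusing the sketches above; the hyperparameters, the batch size of one, and the assumption that labels arrive pre-downsampled to the feature-map resolution are all illustrative:

```python
import torch

def train(model, loader, epochs=50, lr=1e-3):
    """Step B: stochastic gradient descent on the shared-weight network.
    `loader` is assumed to yield (img_t1, img_t2, labels) patches, with
    labels at feature-map resolution; `model` is a FeatureExtractor whose
    backbone is already initialized from a pre-trained model."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for img_t1, img_t2, labels in loader:
            f1 = model(img_t1)                    # the two branches share
            f2 = model(img_t2)                    # weights: same module
            feat = torch.cat([f1, f2], dim=1)[0]  # cascade layer, (2C, H', W')
            a, p, n = select_triplets(feat, labels[0])
            loss = improved_triplet_loss(a, p, n)
            opt.zero_grad()
            loss.backward()
            opt.step()
```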
In some embodiments of the disclosure, step C comprises: using the test data set as the input of the trained two-way deep neural network model obtained in step B, removing the cascade layer, the triplet selection layer and the loss function layer at the end of the model, and keeping the output of the multi-scale feature fusion layer as the learned depth feature representation of the test data set.
In some embodiments of the present disclosure, step D comprises: for the feature representations of the test images acquired in step C, restoring the resolution of the feature maps to the input image size by bilinear interpolation with a factor of 8, and calculating the Euclidean distance between the two feature maps to obtain a difference image.
In some embodiments of the disclosure, step E comprises processing the difference image obtained in step D according to the following strategy:

when d(x_mn) > th, the pixel is set to 255;

when d(x_mn) ≤ th, the pixel is set to 0;

where d(x_mn) denotes the distance value of the pixel at coordinates (m, n) of the difference image and th is a constant threshold. The change detection result image is obtained by this threshold method; because the semantic relation between pixels has been learned, same-label pixels are closer in the result image and noise points are fewer.
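Taken together, steps C-E reduce to a short inference routine. The sketch below assumes the FeatureExtractor above and mirrors the x8 bilinear upsampling, the per-pixel Euclidean distance, and the 0/255 thresholding just described:

```python
import torch
import torch.nn.functional as F

def detect_changes(model, img_t1, img_t2, th):
    """Steps C-E: extract features, build the difference image, threshold."""
    model.eval()
    with torch.no_grad():
        f1, f2 = model(img_t1), model(img_t2)        # (1, C, H/8, W/8) each
    # Step D: restore the input resolution with x8 bilinear interpolation.
    f1 = F.interpolate(f1, scale_factor=8, mode="bilinear", align_corners=False)
    f2 = F.interpolate(f2, scale_factor=8, mode="bilinear", align_corners=False)
    diff = (f1 - f2).pow(2).sum(dim=1).sqrt()        # d(x_mn) per pixel
    # Step E: threshold into a binary change map (255 changed, 0 unchanged).
    return (diff > th).to(torch.uint8) * 255
```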
The data sets adopted by the embodiment of the disclosure are the public SZADA and TISZADOB data sets, comprising 12 groups of images in total.
Fig. 2 shows the multi-temporal remote sensing image data sets to be detected that were selected for the embodiment of the present disclosure. Part (a) of fig. 2 is the first group of different-phase remote sensing images (SZADA data set) in the SZTAKI AirChange Benchmark Set. Part (b) of fig. 2 is the third group of different-phase remote sensing images in the TISZADOB data set of the SZTAKI AirChange Benchmark Set.
It should be noted that fig. 2 shows the original data set images after grayscale processing; in practical applications the input images are color images, for which reference may be made to the public SZADA and TISZADOB data sets.
In some embodiments of the present disclosure, the multi-temporal remote sensing image data to be detected (as shown in fig. 2) is preprocessed to obtain a training data set and a test data set.
In some embodiments of the present disclosure, preprocessing the multi-temporal remote sensing image data to be detected comprises: performing relative radiometric correction on multiple groups of images of the same region at different times by histogram matching, to eliminate radiometric differences between the images of different time phases; and cropping or selecting the registered images to obtain the training data set and the test data set of the remote sensing images.
In some embodiments of the present disclosure, when cropping the registered images, an arbitrary region of each group of two time-phase images is cropped as the test region, and the remaining region of the two time-phase images serves as the training region. After the two time-phase images of the training region are cropped, horizontal and vertical flips and rotations are applied to obtain an expanded training data set.
The upper-left area of each group of image pairs, of size 784 × 448, is cropped as the test region. The remaining area of each image pair serves as the training region; it is cropped, with overlap, into 113 × 113 patches as training samples, and the cropped samples are flipped horizontally and vertically and rotated by 90°, 180° and 270° to expand the training set, giving 2744 training samples in total.
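The preprocessing just described, relative radiometric correction by histogram matching followed by overlapped cropping into 113 × 113 patches and flip/rotation augmentation, can be sketched as follows; the 56-pixel stride chosen for the overlapped cropping is an assumption, since the text does not state the degree of overlap:

```python
import numpy as np
from skimage.exposure import match_histograms

def make_training_set(img_t1, img_t2, label, patch=113, stride=56):
    """Histogram matching plus overlapped cropping and 8-fold augmentation
    (four rotations, each also flipped), as described in the text."""
    # Relative radiometric correction: match phase 2 to phase 1.
    img_t2 = match_histograms(img_t2, img_t1, channel_axis=-1)
    samples = []
    H, W = label.shape
    for i in range(0, H - patch + 1, stride):
        for j in range(0, W - patch + 1, stride):
            tile = tuple(a[i:i + patch, j:j + patch]
                         for a in (img_t1, img_t2, label))
            for k in range(4):                        # 0/90/180/270 degrees
                rot = tuple(np.rot90(a, k) for a in tile)
                samples.append(rot)
                samples.append(tuple(np.fliplr(a) for a in rot))
    return samples
```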
In practical applications, if the number of images included in the data set is large, one part of the data set may be directly selected as a training data set, and the other part of the data set may be selected as a testing data set.
Examples of applications of the present disclosure are further illustrated below:
fig. 3 is a graph of test results for an embodiment of the disclosure.
In fig. 3, (a) shows a standard reference result. Part (b) in fig. 3 is a result of the detection method provided by the embodiment of the present disclosure. Part (c) of fig. 3 is the result of the first comparative method. Part (d) of fig. 3 shows the results of the second comparative method. The upper half of FIG. 3 is the test results for the SZADA/1 data set. The lower half of FIG. 3 is the test results for the TISZADOB/3 data set.
In order to verify the effectiveness of the change detection method provided by the embodiment of the disclosure, the scheme of the invention was tested on a real test data set; test results on a typical group of data sets are given here, with the test data shown in fig. 2. In addition, the change detection results obtained by the detection method provided by the embodiment of the present disclosure are compared with those of two existing methods [Y. Zhan, K. Fu, M. Yan, X. Sun, H. Wang, and X. Qiu, Change detection based on deep siamese convolutional network for optical aerial images, IEEE Geoscience and Remote Sensing Letters, 14(10): 1845-1849, 2017] (comparison method one) and [...: 3384-3394, 2016] (comparison method two), and the corresponding test results are shown in fig. 3. From left to right in fig. 3 are, in order, the standard reference result (a), the result of the method of the invention (b), the result of comparison method one (c), and the result of comparison method two (d).
Here, the change detection experiment result images are analyzed quantitatively:

A. Count the missed detections: the number of pixels that changed in the reference image but are detected as unchanged in the experimental result image is recorded as the missed detection count FN;

B. Count the false detections: the number of pixels that did not change in the reference image but are detected as changed in the experimental result image is recorded as the false detection count FP; correspondingly, the number of changed pixels correctly detected as changed is recorded as the true positive count TP;

C. Calculate the precision Pr = TP/(TP + FP);

D. Calculate the recall Re = TP/(TP + FN);

E. Calculate the evaluation index F-measure = (2 × Re × Pr)/(Re + Pr), which measures the consistency between the experimental result image and the reference image.
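These indices translate directly into a few lines of NumPy; the inputs are assumed to be 0/255 change maps as produced by the threshold method above:

```python
import numpy as np

def evaluate(result, reference):
    """Pixel-wise FN/FP/TP counts and the Pr, Re and F-measure indices."""
    res, ref = result > 0, reference > 0
    TP = np.sum(res & ref)          # changed, detected as changed
    FP = np.sum(res & ~ref)         # unchanged, detected as changed
    FN = np.sum(~res & ref)         # changed, detected as unchanged (missed)
    Pr = TP / (TP + FP)             # precision
    Re = TP / (TP + FN)             # recall
    F = 2 * Re * Pr / (Re + Pr)     # consistency with the reference map
    return Pr, Re, F
```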
Table 1: Performance indexes of the detection results of the method provided by the embodiment of the disclosure, comparison method one, and comparison method two.
The performance indexes of the detection results of the method provided by the embodiment of the disclosure, comparison method one and comparison method two are shown in Table 1. Observation and analysis of fig. 3 and Table 1 show that on the SZADA/1 test data, where the change areas are small and dispersed, the result image of the detection method provided by the embodiment of the disclosure has few noise points and good compactness, and change areas of different scales are well detected. On the TISZADOB/3 test data, where the change area is large and regular, the method provided by the embodiment of the disclosure also achieves a good change detection effect and a clear improvement in the F-measure evaluation index.
The embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. It should be noted that implementations not shown or described in the drawings or the text are forms known to those of ordinary skill in the art and are not described in detail. Further, the above definitions of the various elements and methods are not limited to the specific structures, shapes or arrangements of parts mentioned in the examples, which may be easily modified or substituted by those of ordinary skill in the art.
From the above description, those skilled in the art should have a clear understanding of the method for detecting changes in aerial remote sensing images based on triple semantic relation learning provided by the present disclosure.
In conclusion, the aerial remote sensing image change detection method based on triple semantic relation learning provided by the present disclosure learns the semantic relation among image pixels by a deep learning method to realize remote sensing image change detection; while saving time and labor, it extracts image features more comprehensively and deeply and achieves excellent results in the field of remote sensing image change detection.
It should also be noted that directional terms, such as "upper", "lower", "front", "rear", "left", "right", and the like, used in the embodiments are only directions referring to the drawings, and are not intended to limit the scope of the present disclosure. Throughout the drawings, like elements are represented by like or similar reference numerals. Conventional structures or constructions will be omitted when they may obscure the understanding of the present disclosure.
And the shapes and sizes of the respective components in the drawings do not reflect actual sizes and proportions, but merely illustrate the contents of the embodiments of the present disclosure. Furthermore, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the disclosure, various features of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various disclosed aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, disclosed aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
The above-mentioned embodiments are intended to illustrate the objects, aspects and advantages of the present disclosure in further detail, and it should be understood that the above-mentioned embodiments are only illustrative of the present disclosure and are not intended to limit the present disclosure, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (10)

1. An aerial remote sensing image change detection method based on triple semantic relation learning, comprising the following steps:

Step A: constructing a two-way deep neural network model based on triple semantic relation learning;

Step B: training the two-way deep neural network model with a training data set;

Step C: obtaining a feature representation of the test data set based on the test data set and the trained two-way deep neural network model;

Step D: calculating the Euclidean distance between the two time-phase images based on the feature representation of the test data set to obtain a difference image; and

Step E: processing the difference image with a threshold method to obtain a change detection result;
the step A comprises the step of setting a loss function Triplet loss layer, wherein the Triplet loss layer meets the following forward calculation formula:
Figure FDA0002717808950000011
wherein L (w) represents a loss function, LpRepresenting functions that only consider inter-class losses, LtRepresenting a conventional triplet loss function, P representing the number of pairs of images input to the network, w representing a network parameter, λ representing a weight to measure the two losses, m1Is a constant, m2Is a ratio m1A small constant of the number of the first and second,
Figure FDA0002717808950000012
a feature vector representing each pixel on the input feature map,
Figure FDA0002717808950000013
is shown and
Figure FDA0002717808950000014
the labels of the corresponding feature vectors are the same and the distance is the greatestThe feature vectors of the far are,
Figure FDA0002717808950000015
is shown and
Figure FDA0002717808950000016
the labels of the corresponding feature vectors are different and the feature vector with the nearest distance is obtained.
2. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step A further comprises the following steps:

Step A1: constructing the two-way deep neural network model for feature extraction based on a 101-layer residual network;

Step A2: acquiring a triplet selection layer for training.
3. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 2, wherein the step A1 comprises the following steps:

Step A1a: replacing the fully connected layer in the 101-layer residual network with a fully convolutional layer;

Step A1b: enlarging the receptive field with atrous (dilated) convolution; and

Step A1c: extracting features of different scales with atrous spatial pyramid pooling.
4. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 2, wherein the step A2 comprises the following steps:

Step A2a: concatenating, through a cascade layer, the feature representations of the two time-phase training data in the training data set into one feature map, wherein the feature map satisfies the following formula:

f_w(X) = {f_w(x_ij) | 1 ≤ i ≤ H, 1 ≤ j ≤ W}

wherein f_w(x_ij) denotes the feature vector of the pixel at position (i, j) of the feature map, and H and W denote the height and width of the current feature map;

Step A2b: obtaining the feature vector f_w^a(x_ij) of each pixel on the feature map and marking it as the anchor;

Step A2c: obtaining the feature vector f_w^p(x_ij) that has the same label as the anchor and is farthest from it, and marking it as the positive;

Step A2d: obtaining the feature vector f_w^n(x_ij) that has a different label from the anchor and is closest to it, and marking it as the negative; and

Step A2e: combining the anchor, positive and negative feature vectors into the triplet {f_w^a(x_ij), f_w^p(x_ij), f_w^n(x_ij)}.
5. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 2, wherein the setting of the loss function Triplet loss layer further comprises calculating the partial derivative of the Triplet loss layer according to the following formula:

∂L(w)/∂w = (1/P) Σ_{i,j} [ h_2(w) + λ · h_1(w) ]

wherein h_1(w) denotes the partial derivative of L_p with respect to the parameters and h_2(w) denotes the partial derivative of L_t with respect to the parameters:

h_1(w) = 2 (f_w^a(x_ij) - f_w^p(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^p(x_ij)/∂w) if ||f_w^a(x_ij) - f_w^p(x_ij)||² - m_2 > 0, and h_1(w) = 0 otherwise;

h_2(w) = 2 (f_w^a(x_ij) - f_w^p(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^p(x_ij)/∂w) - 2 (f_w^a(x_ij) - f_w^n(x_ij))^T (∂f_w^a(x_ij)/∂w - ∂f_w^n(x_ij)/∂w) if m_1 + ||f_w^a(x_ij) - f_w^p(x_ij)||² - ||f_w^a(x_ij) - f_w^n(x_ij)||² > 0, and h_2(w) = 0 otherwise.
6. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein in the step B, the two-way deep neural network is trained on the training data set by stochastic gradient descent.
7. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step C comprises: using the test data set as the input of the trained two-way deep neural network model obtained in step B, removing the cascade layer, the triplet selection layer and the loss function layer at the end of the model, and keeping the output of the multi-scale feature fusion layer as the learned depth feature representation of the test data set.
8. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step D comprises: for the feature representations of the test images acquired in step C, restoring the resolution of the feature maps to the input image size by bilinear interpolation with a factor of 8, and calculating the Euclidean distance between the two feature maps to obtain the difference image.
9. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step E comprises processing the difference image obtained in step D according to the following strategy:

when d(x_mn) > th, the pixel is set to 255;

when d(x_mn) ≤ th, the pixel is set to 0;

wherein d(x_mn) denotes the distance value of the pixel at coordinates (m, n) of the difference image, and th denotes a constant threshold.
10. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein multi-temporal remote sensing image data to be detected are preprocessed to obtain the training data set and a test data set;

the preprocessing of the multi-temporal remote sensing image data to be detected comprises:

performing relative radiometric correction on multiple groups of images of the same region at different times by histogram matching, to eliminate radiometric differences between the images of different time phases; and/or

cropping or selecting the registered images to obtain the training data set and the test data set of the remote sensing images, wherein cropping the registered images comprises:

cropping an arbitrary region of each group of two time-phase images as a test region and using the remaining region of the two time-phase images as a training region; and/or

after the two time-phase images of the training region are cropped, applying horizontal and vertical flips and rotations to obtain an expanded training data set.
CN201810385526.4A 2018-04-26 2018-04-26 Aerial remote sensing image change detection method based on triple semantic relation learning Active CN108596108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810385526.4A CN108596108B (en) 2018-04-26 2018-04-26 Aerial remote sensing image change detection method based on triple semantic relation learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810385526.4A CN108596108B (en) 2018-04-26 2018-04-26 Aerial remote sensing image change detection method based on triple semantic relation learning

Publications (2)

Publication Number Publication Date
CN108596108A CN108596108A (en) 2018-09-28
CN108596108B 2021-02-23

Family

ID=63610200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810385526.4A Active CN108596108B (en) 2018-04-26 2018-04-26 Aerial remote sensing image change detection method based on triple semantic relation learning

Country Status (1)

Country Link
CN (1) CN108596108B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558806B (en) * 2018-11-07 2021-09-14 北京科技大学 Method for detecting high-resolution remote sensing image change
CN109635842A (en) * 2018-11-14 2019-04-16 平安科技(深圳)有限公司 A kind of image classification method, device and computer readable storage medium
CN109685141B (en) * 2018-12-25 2022-10-04 合肥哈工慧拣智能科技有限公司 Robot article sorting visual detection method based on deep neural network
CN109919320B (en) * 2019-01-23 2022-04-01 西北工业大学 Triplet network learning method based on semantic hierarchy
CN110059658B (en) * 2019-04-26 2020-11-24 北京理工大学 Remote sensing satellite image multi-temporal change detection method based on three-dimensional convolutional neural network
CN110120020A (en) * 2019-04-30 2019-08-13 西北工业大学 A kind of SAR image denoising method based on multiple dimensioned empty residual error attention network
CN110263644B (en) * 2019-05-21 2021-08-10 华南师范大学 Remote sensing image classification method, system, equipment and medium based on triplet network
CN110378237B (en) * 2019-06-21 2021-06-11 浙江工商大学 Facial expression recognition method based on depth measurement fusion network
CN111178213B (en) * 2019-12-23 2022-11-18 大连理工大学 Aerial photography vehicle detection method based on deep learning
CN112131968A (en) * 2020-09-01 2020-12-25 河海大学 Double-time-phase remote sensing image change detection method based on DCNN
CN112818966B (en) * 2021-04-16 2021-07-30 武汉光谷信息技术股份有限公司 Multi-mode remote sensing image data detection method and system

Citations (3)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106023154A (en) * 2016-05-09 2016-10-12 西北工业大学 Multi-temporal SAR image change detection method based on dual-channel convolutional neural network (CNN)
CN106875395A (en) * 2017-01-12 2017-06-20 西安电子科技大学 Super-pixel level SAR image change detection based on deep neural network
CN107194346A (en) * 2017-05-19 2017-09-22 福建师范大学 A kind of fatigue drive of car Forecasting Methodology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Learning Relationship for Very High Resolution Image Change Detection; Huo Chunlei et al.; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Aug. 31, 2016; Vol. 9, No. 8; full text *

Also Published As

Publication number Publication date
CN108596108A (en) 2018-09-28

Similar Documents

Publication Publication Date Title
CN108596108B (en) Aerial remote sensing image change detection method based on triple semantic relation learning
CN108573276B (en) Change detection method based on high-resolution remote sensing image
Wang et al. Scene classification of high-resolution remotely sensed image based on ResNet
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN108596101B (en) Remote sensing image multi-target detection method based on convolutional neural network
CN110009010B (en) Wide-width optical remote sensing target detection method based on interest area redetection
JP6397379B2 (en) CHANGE AREA DETECTION DEVICE, METHOD, AND PROGRAM
CN110163213B (en) Remote sensing image segmentation method based on disparity map and multi-scale depth network model
CN108960404B (en) Image-based crowd counting method and device
Asokan et al. Machine learning based image processing techniques for satellite image analysis-a survey
CN107067405B (en) Remote sensing image segmentation method based on scale optimization
CN110309781B (en) House damage remote sensing identification method based on multi-scale spectrum texture self-adaptive fusion
CN109871823B (en) Satellite image ship detection method combining rotating frame and context information
WO2018076138A1 (en) Target detection method and apparatus based on large-scale high-resolution hyper-spectral image
CN110826428A (en) Ship detection method in high-speed SAR image
CN107909018B (en) Stable multi-mode remote sensing image matching method and system
CN106548169A (en) Fuzzy literal Enhancement Method and device based on deep neural network
CN113989662A (en) Remote sensing image fine-grained target identification method based on self-supervision mechanism
CN105389799B (en) SAR image object detection method based on sketch map and low-rank decomposition
CN111640116B (en) Aerial photography graph building segmentation method and device based on deep convolutional residual error network
Song et al. Extraction and reconstruction of curved surface buildings by contour clustering using airborne LiDAR data
CN113610070A (en) Landslide disaster identification method based on multi-source data fusion
CN108734200A (en) Human body target visible detection method and device based on BING features
Awad Toward robust segmentation results based on fusion methods for very high resolution optical image and lidar data
CN116883588A (en) Method and system for quickly reconstructing three-dimensional point cloud under large scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant