CN108596108B - Aerial remote sensing image change detection method based on triple semantic relation learning - Google Patents
- Publication number
- CN108596108B (grant); application CN201810385526.4A
- Authority
- CN
- China
- Legal status: Active (assumption by Google Patents, not a legal conclusion)
Classifications
- G06V 20/13: Scenes; terrestrial scenes; satellite images
- G06F 18/2411: Classification techniques based on the proximity to a decision surface, e.g. support vector machines
- G06V 10/30: Image preprocessing; noise filtering
- G06V 10/40: Extraction of image or video features
Abstract
The invention provides an aerial remote sensing image change detection method based on triple semantic relation learning, comprising the following steps. Step A: construct a two-way deep neural network model based on triple semantic relation learning. Step B: train the two-way deep neural network model with a training data set. Step C: obtain a feature representation of the test data set based on the test data set and the trained two-way deep neural network model. Step D: calculate the Euclidean distance between the two phase images based on the feature representation of the test data set to obtain a difference image. Step E: process the difference image with a thresholding method to obtain the change detection result. The method automates the selection of multi-temporal aerial remote sensing image features with deep learning, expresses the images more comprehensively and deeply, requires no manual feature selection, saves time and labor, and is convenient for engineering application.
Description
Technical Field
The disclosure relates to the technical field of remote sensing image processing, and in particular to an aerial remote sensing image change detection method based on triple semantic relation learning.
Background
Human activities strongly affect the earth's surface, and the influence shows in many aspects such as environmental change and urban development. Accurately acquiring surface-cover changes in a timely manner is therefore significant for environmental monitoring and resource management. Change detection means determining changes of the earth's surface by observing the distribution of ground features in the same region at different times. Remote sensing images provide surface information over large areas and long periods and thus have important applications in change detection. In recent years, with the development of aerial remote sensing technology, the volume of aerial image data has become huge, making aerial remote sensing image change detection an important subject in the remote sensing field.
Methods for detecting changes in aerial remote sensing images fall mainly into two types. The first classifies the two phase images separately and then compares the resulting classification maps to obtain the change detection result. The second compares the multi-phase images to generate a difference image and then analyzes the difference image to obtain the result. The latter is the mainstream change detection approach, and generating a high-quality difference image is an important research direction.
However, in implementing the present disclosure, the inventors found that a common way to generate a difference image is to compare the features extracted from the different phases. Traditional change detection methods extract features manually, so the features have limited expressive power. Change detection methods combined with deep learning extract features through a deep neural network, and these features are more robust and abstract, but they ignore the semantic relations between pixels and the multiple scales of the changed areas, so a high-quality difference image cannot be generated.
BRIEF SUMMARY OF THE PRESENT DISCLOSURE
Technical problem to be solved
Based on the above technical problems, the invention provides an aerial remote sensing image change detection method based on triple semantic relation learning, to solve the technical problems that prior-art change detection methods ignore the semantic relations between pixels and the multiple scales of the changed areas and therefore cannot generate a high-quality difference image.
(II) technical scheme
The invention provides an aerial remote sensing image change detection method based on triple semantic relation learning, comprising the following steps. Step A: construct a two-way deep neural network model based on triple semantic relation learning. Step B: train the two-way deep neural network model with a training data set. Step C: obtain a feature representation of the test data set based on the test data set and the trained two-way deep neural network model. Step D: calculate the Euclidean distance between the two phase images based on the feature representation of the test data set to obtain a difference image. Step E: process the difference image with a thresholding method to obtain the change detection result.
In some embodiments of the present disclosure, step A comprises the following. Step A1: construct the two-way deep neural network model for feature extraction based on a 101-layer residual network. Step A2: acquire a triplet selection layer for training. Step A3: set a Triplet loss layer as the loss function.
In some embodiments of the present disclosure, step A1 includes the following. Step A1a: replace the fully connected layer in the 101-layer residual network with a fully convolutional layer. Step A1b: enlarge the receptive field with atrous convolution. Step A1c: extract features at different scales with atrous spatial pyramid pooling.
In some embodiments of the present disclosure, step A2 includes the following. Step A2a: concatenate the trained feature representations of the two phases of training data in the training data set into one feature map through a cascade layer, the feature map satisfying:

f_w(X) = {f_w(x_ij) | 1 ≤ i ≤ H, 1 ≤ j ≤ W}

where f_w(x_ij) is the feature vector of the pixel at feature-map coordinates (i, j), and H and W are the height and width of the current feature map.

Step A2 further includes the following. Step A2b: take the feature vector of each pixel on the feature map as the anchor. Step A2c: find the feature vector with the same label as the anchor and the farthest distance from it, marked positive. Step A2d: find the feature vector with a different label from the anchor and the closest distance to it, marked negative. Step A2e: combine the anchor, positive, and negative feature vectors into a triplet feature vector.
In some embodiments of the present disclosure, in step A3 the Triplet loss layer satisfies the following forward calculation formula:

L(w) = L_t(w) + λ · L_p(w)

L_t(w) = (1/P) Σ max(‖f_w(a) − f_w(p)‖² − ‖f_w(a) − f_w(n)‖² + m1, 0)

L_p(w) = (1/P) Σ max(‖f_w(a) − f_w(p)‖² − m2, 0)

where the sums run over the triplets (a, p, n) selected in step A2, L(w) is the loss function, L_p is the term that only constrains the intra-class distance, L_t is the conventional triplet loss, P is the number of image pairs input to the network, w denotes the network parameters, λ is a weight balancing the two losses, m1 is a constant, and m2 is a constant smaller than m1.

Step A3 further includes calculating the partial derivative of the Triplet loss layer:

∂L(w)/∂w = h_2(w) + λ · h_1(w)

where h_1(w) is the partial derivative of L_p with respect to the parameters and h_2(w) is the partial derivative of L_t with respect to the parameters.
in some embodiments of the present disclosure, in the step B, the two-way deep neural network is trained by using a stochastic gradient descent method using the training data set.
In some embodiments of the present disclosure, step C comprises: take the test data set as the input of the trained two-way deep neural network model obtained in step B, remove the cascade layer, the triplet selection layer, and the loss-function layer at the end of the model, and keep the output of the multi-scale feature-fusion layer as the depth feature representation learned on the test data set.
In some embodiments of the present disclosure, step D comprises: for the feature representation of the test images acquired in step C, restore the feature-map resolution to the input image size by bilinear interpolation with a coefficient of 8, and calculate the Euclidean distance between the two feature maps to obtain the difference image.
In some embodiments of the present disclosure, step E comprises processing the difference image obtained in step D according to the following strategy: when d(x_mn) > th, the pixel is set to 255; when d(x_mn) < th, the pixel is set to 0; where d(x_mn) is the distance value of the corresponding pixel at difference-image coordinates (m, n) and th is a constant threshold.
In some embodiments of the present disclosure, the multi-temporal remote sensing image data to be detected are preprocessed to obtain a training data set and a test data set. The preprocessing comprises: performing relative radiometric correction on the groups of images of the same region at different times by histogram matching, to eliminate the radiometric differences between the phases; and/or cropping or selecting the registered images to obtain the training and test data sets of the remote sensing images, where cropping the registered images comprises cropping an arbitrary area of each group of two-phase images as the test area and using the remaining area of the two-phase images as the training area; and/or, after cropping the two-phase images of the training area, applying horizontal and vertical flips and rotations to obtain an expanded training data set.
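As an illustration of the relative radiometric correction step, the following is a minimal NumPy sketch of histogram matching; the function name and the uint8 grayscale assumption are illustrative choices, not from the patent.

```python
import numpy as np

def histogram_match(source, reference):
    """Match the gray-level histogram of `source` to `reference`
    (both uint8 single-band arrays) by aligning their CDFs."""
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Normalized cumulative distribution functions of both images
    src_cdf = np.cumsum(src_counts).astype(np.float64) / source.size
    ref_cdf = np.cumsum(ref_counts).astype(np.float64) / reference.size

    # For each source gray level, pick the reference level with the closest CDF
    matched_values = np.interp(src_cdf, ref_cdf, ref_values)
    return matched_values[src_idx].reshape(source.shape).astype(source.dtype)
```

Applying this to one phase image with the other phase as reference suppresses the global radiometric difference before any feature extraction.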
(III) advantageous effects
According to the technical scheme, the aerial remote sensing image change detection method based on triple semantic relation learning has one or more of the following beneficial effects:
(1) the selection of multi-temporal aerial remote sensing image features is automated with a deep learning method; the images are expressed more comprehensively and deeply, no manual feature selection is needed, time and labor are saved, and engineering application is convenient;
(2) the two-way network processes two images simultaneously, and its two branches share weights, which is equivalent to extracting features from both images with the same mapping;
(3) atrous convolution addresses the loss of feature-map resolution caused by downsampling and pooling: it reduces the shrink factor of the picture size and effectively enlarges the receptive field without increasing the number of parameters or the amount of computation, yielding a denser feature map;
(4) atrous spatial pyramid pooling enables the extraction of feature representations of changed areas at different scales;
(5) with the improved triplet loss function, learning the semantic relations between pixels makes the distance between feature vectors of same-label pixels smaller than that between different-label pixels, enhancing the compactness of same-label pixels;
(6) the change detection result image is obtained by a thresholding method; because the semantic relations between pixels are learned, same-label pixels lie closer together and the result image contains fewer noise points.
Drawings
Fig. 1 is a schematic flow chart of an aerial remote sensing image change detection method based on triple semantic relation learning according to an embodiment of the present disclosure.
Fig. 2 is a multi-temporal remote sensing image dataset to be detected selected by the embodiment of the present disclosure.
Fig. 3 is a graph of test results for an embodiment of the disclosure.
Detailed Description
In the aerial remote sensing image change detection method based on triple semantic relation learning of the present disclosure, a deep learning method learns the semantic relations among image pixels to realize remote sensing image change detection. No manual feature extraction is needed, which saves time and labor; image features are extracted more comprehensively and deeply to obtain the difference image, which is then analyzed to obtain the change result image. Excellent results are obtained in the field of remote sensing image change detection.
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
Fig. 1 is a schematic flow chart of an aerial remote sensing image change detection method based on triple semantic relation learning according to an embodiment of the present disclosure.
The embodiment of the disclosure provides an aerial photography remote sensing image change detection method based on triple semantic relation learning, as shown in fig. 1, including:
Step A: construct a two-way deep neural network model based on triple semantic relation learning;
Step B: train the two-way deep neural network model with a training data set;
Step C: obtain a feature representation of the test data set based on the test data set and the trained two-way deep neural network model;
Step D: calculate the Euclidean distance between the two phase images based on the feature representation of the test data set to obtain a difference image;
Step E: process the difference image with a thresholding method to obtain the change detection result. The method automates the selection of multi-temporal aerial remote sensing image features with deep learning, expresses the images more comprehensively and deeply, requires no manual feature selection, saves time and labor, and is convenient for engineering application.
In some embodiments of the present disclosure, step a comprises:
Step A1: construct the two-way deep neural network model for feature extraction based on a 101-layer residual network;
Step A2: acquire a triplet selection layer for training;
Step A3: set a Triplet loss layer as the loss function.
In step A1, a two-way deep neural network model for feature extraction is constructed with the training data set as input. The network structure is based on a 101-layer residual network; the two branches of the network process the two images simultaneously and share weights, which is equivalent to extracting features from both images with the same mapping.
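The weight sharing of the two branches can be sketched as follows; the toy single-layer convolution stands in for the full residual network and is only meant to show that both phases pass through the same mapping.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, a stand-in for one convolutional layer."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def two_branch_features(img_t1, img_t2, shared_kernel):
    """Both branches apply the SAME weights, so the two phase images are
    embedded by the same mapping, the essence of the two-way design."""
    return conv2d(img_t1, shared_kernel), conv2d(img_t2, shared_kernel)
```

Because the weights are shared, identical inputs always produce identical features, so any difference in the outputs reflects a difference between the phases rather than between the branches.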
In practical application, the 101-layer residual network can be divided into five convolutional blocks. The first block consists of a convolutional layer and a pooling layer, where the convolution kernel is 7 × 7 and the number of filters is 64. The second block consists of 3 groups of [(1 × 1, 64), (3 × 3, 64), (1 × 1, 256)] convolution filters, where the first entry (1 × 1, 3 × 3, 1 × 1) is the kernel size and the second (64, 256) is the number of filters. The third block consists of 4 groups of [(1 × 1, 128), (3 × 3, 128), (1 × 1, 512)] filters, the fourth of 23 groups of [(1 × 1, 256), (3 × 3, 256), (1 × 1, 1024)], and the fifth of 3 groups of [(1 × 1, 512), (3 × 3, 512), (1 × 1, 2048)]. The last layer is a fully connected layer. To make the network structure more suitable for change detection, the residual network is modified as follows:
step A1 a: the full link layers in the residual network of 101 layers are replaced by full convolutional layers, that is, the full link layers in the residual network of 101 layers are replaced by full convolutional layers with convolutional filter convolutional kernels of 1 × 1 and the number of the convolutional layers is 16.
Step A1 b: the problem of reduced resolution of the feature map caused by downsampling and pooling is solved by adopting the porous convolution, in the embodiment of the disclosure, the porous convolution can only reduce the size of the picture by 8 times instead of 32 times of the original size, and meanwhile, under the condition of not increasing the number of parameters and the calculation amount, the scope of the receptive field is effectively expanded, so that a denser feature map is obtained; in the disclosed embodiment, all of the four and five convolutional layers use a perforated convolutional layer, where the hole rate of the fourth convolutional block is 2 and the hole rate of the fifth convolutional block is 4.
Step A1 c: the multi-scale target problem is solved by adopting a porous spatial pyramid pooling method, and because different porosity rates correspond to different receptive field sizes, the characteristics of different scales can be extracted by adopting different porosity rates; in the embodiment of the disclosure, a spatial pyramid structure with holes is used behind the fifth convolution block, and a plurality of parallel convolution kernels with holes of different hole rates are used for processing, wherein the convolution kernels are all 3 × 3, and the hole rates are respectively 6, 12, 18, and 24, so that objects and context information of a plurality of scales can be captured, and finally, multi-scale features are fused, so as to obtain feature representation of an image.
As described above, in step A2 the feature representations of the two phases of training data in the training data set are concatenated by the cascade layer into a feature map satisfying:

f_w(X) = {f_w(x_ij) | 1 ≤ i ≤ H, 1 ≤ j ≤ W}

where f_w(x_ij) is the feature vector of the pixel at feature-map coordinates (i, j), and H and W are the height and width of the current feature map.
in some embodiments of the present disclosure, a feature vector for each pixel on a feature map is obtainedIs marked as anchor; and acquiring the feature vector which is the same as the anchor label and has the farthest distanceMarking as positive; then, feature vectors which are different from the anchor label and are closest to the anchor label are obtainedMarking as negative; finally, the feature vector anchorThe positive and negative component triplet feature vectorsAnd aiming at one anchor feature vector, obtaining a triple feature vector of each pixel on the feature map, wherein all the positive feature vectors can form a corresponding positive feature map, and all the negative feature vectors can form a corresponding negative feature map.
As described above, in step A3 the anchor, positive, and negative feature maps obtained in step A2 are used to compute the Triplet loss. The similarity between feature vectors is measured by Euclidean distance. The conventional triplet loss function only requires that, within a triplet, the distance between feature vectors with different labels exceed the distance between those with the same label by a specified value. In change detection, however, a training sample may be entirely changed or entirely unchanged, in which case no negative feature map exists; in addition, the conventional loss function does not constrain the anchor-positive distance itself, which may therefore remain large and fail the requirements of the embodiments of the present disclosure. The conventional triplet loss function is thus unsuitable here. The loss function provided in the embodiments of the disclosure adds to the conventional loss a constraint on the anchor-positive distance, keeping it within a specific range, so that same-label pixels lie closer together in the feature space and different-label pixels lie farther apart.
In some embodiments of the present disclosure, the Triplet loss layer satisfies the following forward calculation formula:

L(w) = L_t(w) + λ · L_p(w)

L_t(w) = (1/P) Σ max(‖f_w(a) − f_w(p)‖² − ‖f_w(a) − f_w(n)‖² + m1, 0)

L_p(w) = (1/P) Σ max(‖f_w(a) − f_w(p)‖² − m2, 0)

where the sums run over the triplets (a, p, n) selected in step A2, L(w) is the loss function, L_p is the term that only constrains the intra-class distance, L_t is the conventional triplet loss, P is the number of image pairs input to the network, w denotes the network parameters, λ is a weight balancing the two losses, m1 is a constant, and m2 is a constant smaller than m1. In some embodiments of the disclosure, λ = 0.5, m1 = 0.5, and m2 = 0.3.
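Since the forward formula appears only as an image in the source, the following NumPy sketch implements one plausible form consistent with the term descriptions: the conventional triplet term L_t with margin m1, plus a λ-weighted intra-class term L_p that caps the anchor-positive distance at m2. It is an assumption, not the patent's verbatim formula.

```python
import numpy as np

def improved_triplet_loss(anchor, positive, negative, lam=0.5, m1=0.5, m2=0.3):
    """anchor/positive/negative: (N, D) arrays of mined feature vectors.
    The exact combination of the two terms is an assumption consistent
    with the patent's description of the improved loss."""
    d_ap = ((anchor - positive) ** 2).sum(-1)   # squared intra-class distance
    d_an = ((anchor - negative) ** 2).sum(-1)   # squared inter-class distance
    lt = np.maximum(d_ap - d_an + m1, 0.0).mean()  # push classes m1 apart
    lp = np.maximum(d_ap - m2, 0.0).mean()         # keep same-label dist < m2
    return lt + lam * lp
```

The L_p term is what remains active on fully changed or fully unchanged patches, where no negative exists and the conventional triplet term cannot be formed.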
In some embodiments of the present disclosure, step A3 includes calculating the partial derivative of the Triplet loss layer according to the following formula:

∂L(w)/∂w = h_2(w) + λ · h_1(w)

where h_1(w) is the partial derivative of L_p with respect to the parameters and h_2(w) is the partial derivative of L_t with respect to the parameters.
in some embodiments of the present disclosure, in step B, the two-way deep neural network is trained using a random gradient descent method using a training data set. The initialization and gradient change of two branches of the network are identical, and when the loss function of the whole deep neural network tends to be close to the local optimal solution, the training is completed. Because the number of network layers is large, the network is difficult to achieve the optimal state, and therefore the network is initialized to a pre-trained model.
In some embodiments of the disclosure, step C comprises: take the test data set as the input of the trained two-way deep neural network model obtained in step B, remove the cascade layer, the triplet selection layer, and the loss-function layer at the end of the model, and keep the output of the multi-scale feature-fusion layer as the depth feature representation learned on the test data set.
In some embodiments of the present disclosure, step D comprises: for the feature representation of the test images acquired in step C, restore the feature-map resolution to the input image size by bilinear interpolation with a coefficient of 8, and calculate the Euclidean distance between the two feature maps to obtain the difference image.
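Step D can be sketched as follows; the hand-rolled bilinear interpolation stands in for the coefficient-8 upsampling, and the (C, H, W) feature layout is an illustrative choice.

```python
import numpy as np

def bilinear_upsample(channel, factor=8):
    """Bilinear interpolation of one 2-D feature channel by `factor`."""
    h, w = channel.shape
    ys = np.linspace(0, h - 1, h * factor)
    xs = np.linspace(0, w - 1, w * factor)
    # Interpolate along rows, then along columns
    rows = np.stack([np.interp(ys, np.arange(h), channel[:, j])
                     for j in range(w)], axis=1)
    return np.stack([np.interp(xs, np.arange(w), rows[i])
                     for i in range(h * factor)], axis=0)

def difference_image(feat1, feat2, factor=8):
    """feat1, feat2: (C, H, W) feature maps of the two phases. Restores the
    spatial resolution, then takes the per-pixel Euclidean distance."""
    up1 = np.stack([bilinear_upsample(c, factor) for c in feat1])
    up2 = np.stack([bilinear_upsample(c, factor) for c in feat2])
    return np.sqrt(((up1 - up2) ** 2).sum(axis=0))
```

Identical features from the two phases produce a zero distance map, so bright values in the output mark candidate changed pixels.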
In some embodiments of the disclosure, step E comprises processing the difference image obtained in step D according to the following strategy:

when d(x_mn) > th, the pixel is set to 255;

when d(x_mn) < th, the pixel is set to 0;

where d(x_mn) is the distance value of the corresponding pixel at difference-image coordinates (m, n) and th is a constant threshold. The change detection result image is obtained by this thresholding method; because the semantic relations between pixels are learned, pixels with the same label lie closer together and the result image contains fewer noise points.
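The thresholding strategy of step E in a few lines of NumPy:

```python
import numpy as np

def threshold_difference(diff, th):
    """Binarize the difference image: distances above th become 255
    (changed), the rest 0 (unchanged)."""
    return np.where(diff > th, 255, 0).astype(np.uint8)
```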
The data sets adopted by the embodiment of the disclosure are the public SZADA and TISZADOB data sets, comprising 12 groups of images in total.
Fig. 2 is a multi-temporal remote sensing image dataset to be detected selected by the embodiment of the present disclosure. Part (a) in fig. 2 is a first group of remote sensing data sets of different phases in the SZTAKI AirChange Benchmark Set. Part (b) in fig. 2 is a third group of remote sensing data sets of different phases in the TISZADOB data Set in the SZTAKI AirChange Benchmark Set.
It should be noted that fig. 2 shows the original data set images after grayscale processing; in practical applications the input images are color images, for which reference may be made to the public SZADA and TISZADOB data sets.
In some embodiments of the present disclosure, the multi-temporal remote sensing image data to be detected (as shown in fig. 2) is preprocessed to obtain a training data set and a test data set.
In some embodiments of the present disclosure, the preprocessing the multi-temporal remote sensing image data to be detected includes: carrying out relative radiation correction on a plurality of groups of images in the same region at different times by using a histogram matching method to eliminate radiation difference among the images at different time phases; and cutting or selecting the registered image to obtain a training data set and a test data set of the remote sensing image.
In some embodiments of the present disclosure, in cropping the registered images, any region of each set of two-time phase images is cropped as a test region; the remaining area of the two-phase images serves as a training area.
In some embodiments of the present disclosure, in the cropping of the registered image, after the two-time phase image of the training area is cropped, the horizontal and vertical flipping and the rotation change are performed to obtain an extended training data set.
The upper-left area of each group of image pairs, of size 784 × 448, is cropped as the test area. The remaining area of the image pair is the training area; it is cropped with overlap into 113 × 113 patches as training samples, and each cropped sample is flipped horizontally and vertically and rotated by 90°, 180°, and 270° to expand the training samples, giving a training data set of 2744 samples in total.
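The training-set expansion can be sketched as follows; whether the original patch is also kept among the variants is an assumption, since the text only lists the flips and rotations.

```python
import numpy as np

def augment(patch):
    """One 113 x 113 training patch -> six variants: the original,
    horizontal and vertical flips, and 90/180/270-degree rotations."""
    return [patch,
            np.fliplr(patch), np.flipud(patch),
            np.rot90(patch, 1), np.rot90(patch, 2), np.rot90(patch, 3)]
```

Because the patches are square, every rotation keeps the 113 × 113 shape, so all variants can be fed to the same network input.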
In practical applications, if the number of images included in the data set is large, one part of the data set may be directly selected as a training data set, and the other part of the data set may be selected as a testing data set.
Examples of applications of the present disclosure are further illustrated below:
fig. 3 is a graph of test results for an embodiment of the disclosure.
In fig. 3, (a) shows a standard reference result. Part (b) in fig. 3 is a result of the detection method provided by the embodiment of the present disclosure. Part (c) of fig. 3 is the result of the first comparative method. Part (d) of fig. 3 shows the results of the second comparative method. The upper half of FIG. 3 is the test results for the SZADA/1 data set. The lower half of FIG. 3 is the test results for the TISZADOB/3 data set.
In order to verify the effectiveness of the change detection method provided by the embodiment of the disclosure, the scheme of the invention is tested on a real test data set; results on a typical group of data are given here, with the test data set shown in fig. 2. In addition, the change detection result obtained by the detection method provided by the embodiment of the present disclosure is compared with the results of two existing methods: Y. Zhan, K. Fu, M. Yan, X. Sun, H. Wang, and X. Qiu, "Change detection based on deep Siamese convolutional network for optical aerial images," IEEE Geoscience and Remote Sensing Letters, 14(10): 1845-1849, 2017 (comparison method one), and a second existing method (pp. 3384-3394, 2016) (comparison method two). The corresponding test results are shown in fig. 3; from left to right are the standard reference result (a), the result of the inventive method (b), the result of comparison method one (c), and the result of comparison method two (d).
For the change detection experiment result graphs, a quantitative analysis was carried out as follows:

A. Calculate the number of missed detections: the number of pixels that are changed in the reference image but detected as unchanged in the experimental result image, recorded as the missed detection count FN;

B. Calculate the number of false detections: the number of pixels that are unchanged in the reference image but detected as changed in the experimental result image, recorded as the false detection count FP;

C. Calculate the precision Pr = TP/(TP + FP), where TP is the number of correctly detected changed pixels;

D. Calculate the recall Re = TP/(TP + FN);

E. Calculate the evaluation index F-measure = (2 × Re × Pr)/(Re + Pr), which measures the consistency between the experimental result graph and the reference graph.
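The evaluation indexes A through E can be computed as follows; the 255/0 encoding matches the thresholded result images.

```python
import numpy as np

def change_metrics(result, reference):
    """result, reference: binary change maps (255 = changed, 0 = unchanged).
    Returns FN, FP, precision Pr, recall Re, and F-measure, with TP the
    number of correctly detected changed pixels."""
    res = result == 255
    ref = reference == 255
    tp = int(np.sum(res & ref))
    fn = int(np.sum(~res & ref))   # changed in reference, missed by result
    fp = int(np.sum(res & ~ref))   # unchanged in reference, flagged changed
    pr = tp / (tp + fp)
    re = tp / (tp + fn)
    f = 2 * re * pr / (re + pr)
    return fn, fp, pr, re, f
```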
Table 1: comparison of the performance indexes of the detection results of the method provided by the embodiment of the disclosure, comparison method one, and comparison method two
The performance indexes of the detection results of the method provided by the embodiment of the disclosure, comparison method one, and comparison method two are shown in table 1. Observing and analyzing fig. 3 and table 1: on the SZADA/1 test data the changed areas are small and dispersed; the result of the detection method provided by the embodiment of the disclosure has few noise points and good compactness, and changed areas of different scales are detected well. On the TISZADOB/3 test data the changed area is large and regular; the method provided by the embodiment of the disclosure also achieves a good change detection effect and a clear improvement in the F-measure evaluation index.
So far, the embodiments of the present disclosure have been described in detail with reference to the accompanying drawings. It is to be noted that implementation modes not shown or described in the drawings or the description are modes known to persons of ordinary skill in the art and are not described in detail. Further, the above definitions of the various elements and methods are not limited to the specific structures, shapes or arrangements of parts mentioned in the embodiments, which may be easily modified or substituted by those of ordinary skill in the art.
From the above description, those skilled in the art should have a clear understanding of the method for detecting changes in aerial remote sensing images based on triple semantic relation learning provided by the present disclosure.
In conclusion, the method for detecting changes in aerial remote sensing images based on triple semantic relation learning provided by the present disclosure learns the semantic relations among image pixels by a deep learning method to realize remote sensing image change detection. It saves time and labor while extracting image features more comprehensively and deeply, and thereby achieves excellent results in the field of remote sensing image change detection.
It should also be noted that directional terms, such as "upper", "lower", "front", "rear", "left", "right", and the like, used in the embodiments are only directions referring to the drawings, and are not intended to limit the scope of the present disclosure. Throughout the drawings, like elements are represented by like or similar reference numerals. Conventional structures or constructions will be omitted when they may obscure the understanding of the present disclosure.
And the shapes and sizes of the respective components in the drawings do not reflect actual sizes and proportions, but merely illustrate the contents of the embodiments of the present disclosure. Furthermore, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the disclosure, various features of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various disclosed aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that is, the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, disclosed aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
The above-mentioned embodiments are intended to illustrate the objects, aspects and advantages of the present disclosure in further detail, and it should be understood that the above-mentioned embodiments are only illustrative of the present disclosure and are not intended to limit the present disclosure, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.
Claims (10)
1. An aerial photography remote sensing image change detection method based on triple semantic relation learning comprises the following steps:
step A: constructing a double-path deep neural network model based on triple semantic relation learning;
step B: training the two-way deep neural network model by using a training data set;
step C: obtaining a feature representation of the test data set based on the test data set and the trained two-way deep neural network model;
step D: calculating the Euclidean distance between two time phase images based on the feature representation of the test data set to obtain a difference image; and
step E: processing the difference image by using a threshold value method to obtain a change detection result;
the step A comprises the step of setting a loss function Triplet loss layer, wherein the Triplet loss layer meets the following forward calculation formula:
wherein L (w) represents a loss function, LpRepresenting functions that only consider inter-class losses, LtRepresenting a conventional triplet loss function, P representing the number of pairs of images input to the network, w representing a network parameter, λ representing a weight to measure the two losses, m1Is a constant, m2Is a ratio m1A small constant of the number of the first and second,a feature vector representing each pixel on the input feature map,is shown andthe labels of the corresponding feature vectors are the same and the distance is the greatestThe feature vectors of the far are,is shown andthe labels of the corresponding feature vectors are different and the feature vector with the nearest distance is obtained.
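As an illustrative sketch only: the patent's exact forward formula appears as an image in the original and is not reproduced here, so the following numpy code uses a standard triplet-loss formulation consistent with the symbols defined in claim 1 (a conventional triplet term Lt with margin m1 plus a λ-weighted inter-class-only term Lp with margin m2 < m1). The function name and default values are assumptions.

```python
import numpy as np

def triplet_relation_loss(anchor, positive, negative, m1=0.5, m2=0.3, lam=1.0):
    """Illustrative combined loss L(w) = L_t + lam * L_p.

    anchor/positive/negative: (P, D) arrays of pixel feature vectors;
    `positive` shares the anchor's label (the farthest such vector) and
    `negative` has a different label (the nearest such vector); m1 > m2.
    """
    d_ap = np.sum((anchor - positive) ** 2, axis=1)   # intra-class distances
    d_an = np.sum((anchor - negative) ** 2, axis=1)   # inter-class distances
    l_t = np.maximum(0.0, d_ap - d_an + m1).mean()    # conventional triplet term
    l_p = np.maximum(0.0, m2 - d_an).mean()           # inter-class-only term
    return l_t + lam * l_p

a = np.array([[0.0, 0.0]])
p = np.array([[2.0, 0.0]])
n = np.array([[1.0, 0.0]])
loss = triplet_relation_loss(a, p, n)   # d_ap = 4, d_an = 1 -> L_t = 3.5, L_p = 0
```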
2. The method for detecting changes in aerial remote sensing images based on triple semantic relationship learning according to claim 1, wherein the step A further comprises the following steps:
step A1: constructing the two-way deep neural network model for feature extraction based on a 101-layer residual network;
step A2: constructing a triplet selection layer for training.
3. The method for detecting changes in aerial remote sensing images based on triple semantic relationship learning according to claim 2, wherein the step A1 comprises the following steps:
step A1a: replacing the fully connected layer in the 101-layer residual network with a fully convolutional layer;
step A1b: enlarging the receptive field by adopting atrous (dilated) convolution; and
step A1c: extracting features of different scales by adopting an atrous spatial pyramid pooling method.
4. The method for detecting changes in aerial remote sensing images based on triple semantic relationship learning according to claim 2, wherein the step A2 comprises the following steps:
step A2a: cascading the trained feature representations of the training data of the two time phases in the training data set into a feature map through a cascade layer, wherein the feature map satisfies the following formula:
fw(X)={fw(xij)|1≤i≤H,1≤j≤W}
wherein fw(xij) represents the feature vector of the pixel at coordinates (i, j) on the feature map, and H and W represent the height and width of the current feature map;
step A2c: acquiring the feature vector that has the same label as the anchor and is farthest from it, and marking it as a positive;
step A2d: acquiring the feature vector that has a different label from the anchor and is closest to it, and marking it as a negative.
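The hard-example selection of steps A2c and A2d — for each anchor, the farthest same-label feature and the nearest different-label feature — might be sketched as follows (illustrative numpy; the function name `select_triplets` is an assumption):

```python
import numpy as np

def select_triplets(features, labels):
    """For each anchor feature, pick the FARTHEST same-label feature
    (the positive, step A2c) and the NEAREST different-label feature
    (the negative, step A2d). features: (N, D); labels: (N,)."""
    # pairwise Euclidean distance matrix, shape (N, N)
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    same = labels[:, None] == labels[None, :]
    pos_idx = np.where(same, d, -np.inf).argmax(axis=1)  # hardest positive
    neg_idx = np.where(same, np.inf, d).argmin(axis=1)   # hardest negative
    return pos_idx, neg_idx

features = np.array([[0.0], [1.0], [3.0], [4.0]])
labels = np.array([0, 0, 1, 1])
pos_idx, neg_idx = select_triplets(features, labels)
```

Masking with ±inf before argmax/argmin keeps the whole selection vectorized; in practice the features would come from the cascaded feature map of step A2a.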
5. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 2, wherein the setting of the loss function Triplet loss layer further comprises: calculating the partial derivative of the Triplet loss layer according to a formula of the form:
∂L(w)/∂w = h2(w) + λ·h1(w)
wherein h1(w) represents the partial derivative of Lp with respect to the parameters, and h2(w) represents the partial derivative of Lt with respect to the parameters.
6. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein in the step B, the two-way deep neural network is trained by using a stochastic gradient descent method and the training data set.
7. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step C comprises: taking the test data set as the input of the trained two-way deep neural network model obtained in the step B, removing the cascade layer, the triple selection layer and the loss function layer at the tail of the model, and keeping the output of the multi-scale feature fusion layer as the depth feature representation learned on the test data set.
8. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step D comprises: for the feature representation of the test image acquired in the step C, restoring the resolution of the feature map to the size of the input image by bilinear interpolation with a factor of 8, and calculating the Euclidean distance between the two feature maps to obtain a difference image.
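The upsampling and difference-image computation of step D can be sketched as follows (a minimal numpy illustration; the align-corners-style interpolation and the function names are assumptions, and a deep-learning framework's built-in resize would normally be used instead):

```python
import numpy as np

def bilinear_upsample(fmap, factor=8):
    """Upsample a 2-D feature map by `factor` with bilinear interpolation
    (endpoints map to endpoints, align-corners style)."""
    h, w = fmap.shape
    rows = np.linspace(0, h - 1, h * factor)
    cols = np.linspace(0, w - 1, w * factor)
    # interpolate along columns first, then along rows
    tmp = np.array([np.interp(cols, np.arange(w), row) for row in fmap])
    out = np.array([np.interp(rows, np.arange(h), col) for col in tmp.T]).T
    return out

def difference_image(feat_a, feat_b):
    """Per-pixel Euclidean distance between two (C, H, W) feature maps."""
    return np.sqrt(np.sum((feat_a - feat_b) ** 2, axis=0))

fmap = np.array([[0.0, 1.0], [2.0, 3.0]])
up = bilinear_upsample(fmap, factor=2)     # (4, 4), corners preserved
fa = np.zeros((2, 1, 1))
fb = np.array([[[3.0]], [[4.0]]])
diff = difference_image(fa, fb)            # sqrt(3^2 + 4^2) per pixel
```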
9. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein the step E comprises: processing the difference image obtained in the step D according to the following strategy:
when d(x_mn) is greater than th, the pixel point is set to 255;
when d(x_mn) is less than th, the pixel point is set to 0;
wherein d(x_mn) represents the distance value of the pixel at coordinates (m, n) of the difference image, and th represents a constant threshold.
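The thresholding strategy of step E maps directly to a short numpy expression (illustrative; the function name and the sample threshold are assumptions):

```python
import numpy as np

def threshold_change_map(diff, th):
    """Binarize the difference image: 255 where d(x_mn) > th, 0 otherwise."""
    return np.where(diff > th, 255, 0).astype(np.uint8)

change_map = threshold_change_map(np.array([[0.1, 0.9], [0.5, 0.4]]), th=0.45)
```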
10. The method for detecting changes in aerial remote sensing images based on triple semantic relation learning according to claim 1, wherein multi-temporal remote sensing image data to be detected is preprocessed to obtain the training data set and the test data set;
the preprocessing of the multi-temporal remote sensing image data to be detected comprises the following steps:
carrying out relative radiation correction on a plurality of groups of images in the same region at different times by using a histogram matching method to eliminate radiation difference among the images at different time phases; and/or
Cutting or selecting the registered image to obtain a training data set and a test data set of the remote sensing image, wherein the cutting of the registered image comprises the following steps:
cutting any area of each group of two-time phase images as a test area, and taking the rest area of the two-time phase images as a training area; and/or
After the two time-phase images of the training area are cut, horizontal and vertical flipping and rotation transformations are performed to obtain an expanded training data set.
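The flip-and-rotate expansion described above can be sketched as follows (illustrative; the function name `augment_pair` is hypothetical). Both time phases are transformed identically so the pixel correspondence between them is preserved; the eight variants (four rotations, each with and without a horizontal flip) also cover vertical flips.

```python
import numpy as np

def augment_pair(img_t1, img_t2):
    """Expand one two-phase training pair via 90-degree rotations and
    horizontal flips, applied identically to both phases so the pixel
    correspondence between the two time phases is preserved."""
    pairs = []
    for rot in range(4):                               # 0/90/180/270 degrees
        ra, rb = np.rot90(img_t1, rot), np.rot90(img_t2, rot)
        pairs.append((ra, rb))                         # rotated pair
        pairs.append((np.fliplr(ra), np.fliplr(rb)))   # rotated + flipped pair
    return pairs

a = np.arange(4).reshape(2, 2)
b = a + 10
aug = augment_pair(a, b)   # 8 geometric variants of the pair
```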
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810385526.4A CN108596108B (en) | 2018-04-26 | 2018-04-26 | Aerial remote sensing image change detection method based on triple semantic relation learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108596108A CN108596108A (en) | 2018-09-28 |
CN108596108B true CN108596108B (en) | 2021-02-23 |
Family
ID=63610200
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810385526.4A Active CN108596108B (en) | 2018-04-26 | 2018-04-26 | Aerial remote sensing image change detection method based on triple semantic relation learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108596108B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109558806B (en) * | 2018-11-07 | 2021-09-14 | 北京科技大学 | Method for detecting high-resolution remote sensing image change |
CN109635842A (en) * | 2018-11-14 | 2019-04-16 | 平安科技(深圳)有限公司 | A kind of image classification method, device and computer readable storage medium |
CN109685141B (en) * | 2018-12-25 | 2022-10-04 | 合肥哈工慧拣智能科技有限公司 | Robot article sorting visual detection method based on deep neural network |
CN109919320B (en) * | 2019-01-23 | 2022-04-01 | 西北工业大学 | Triplet network learning method based on semantic hierarchy |
CN110059658B (en) * | 2019-04-26 | 2020-11-24 | 北京理工大学 | Remote sensing satellite image multi-temporal change detection method based on three-dimensional convolutional neural network |
CN110120020A (en) * | 2019-04-30 | 2019-08-13 | 西北工业大学 | A kind of SAR image denoising method based on multiple dimensioned empty residual error attention network |
CN110263644B (en) * | 2019-05-21 | 2021-08-10 | 华南师范大学 | Remote sensing image classification method, system, equipment and medium based on triplet network |
CN110378237B (en) * | 2019-06-21 | 2021-06-11 | 浙江工商大学 | Facial expression recognition method based on depth measurement fusion network |
CN111178213B (en) * | 2019-12-23 | 2022-11-18 | 大连理工大学 | Aerial photography vehicle detection method based on deep learning |
CN112131968A (en) * | 2020-09-01 | 2020-12-25 | 河海大学 | Double-time-phase remote sensing image change detection method based on DCNN |
CN112818966B (en) * | 2021-04-16 | 2021-07-30 | 武汉光谷信息技术股份有限公司 | Multi-mode remote sensing image data detection method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106023154A (en) * | 2016-05-09 | 2016-10-12 | 西北工业大学 | Multi-temporal SAR image change detection method based on dual-channel convolutional neural network (CNN) |
CN106875395A (en) * | 2017-01-12 | 2017-06-20 | 西安电子科技大学 | Super-pixel level SAR image change detection based on deep neural network |
CN107194346A (en) * | 2017-05-19 | 2017-09-22 | 福建师范大学 | A kind of fatigue drive of car Forecasting Methodology |
Non-Patent Citations (1)
Title |
---|
Huo Chunlei et al., "Learning Relationship for Very High Resolution Image Change Detection," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 9, no. 8, August 2016 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108596108B (en) | Aerial remote sensing image change detection method based on triple semantic relation learning | |
CN108573276B (en) | Change detection method based on high-resolution remote sensing image | |
Wang et al. | Scene classification of high-resolution remotely sensed image based on ResNet | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN108596101B (en) | Remote sensing image multi-target detection method based on convolutional neural network | |
CN110009010B (en) | Wide-width optical remote sensing target detection method based on interest area redetection | |
JP6397379B2 (en) | CHANGE AREA DETECTION DEVICE, METHOD, AND PROGRAM | |
CN110163213B (en) | Remote sensing image segmentation method based on disparity map and multi-scale depth network model | |
CN108960404B (en) | Image-based crowd counting method and device | |
Asokan et al. | Machine learning based image processing techniques for satellite image analysis-a survey | |
CN107067405B (en) | Remote sensing image segmentation method based on scale optimization | |
CN110309781B (en) | House damage remote sensing identification method based on multi-scale spectrum texture self-adaptive fusion | |
CN109871823B (en) | Satellite image ship detection method combining rotating frame and context information | |
WO2018076138A1 (en) | Target detection method and apparatus based on large-scale high-resolution hyper-spectral image | |
CN110826428A (en) | Ship detection method in high-speed SAR image | |
CN107909018B (en) | Stable multi-mode remote sensing image matching method and system | |
CN106548169A (en) | Fuzzy literal Enhancement Method and device based on deep neural network | |
CN113989662A (en) | Remote sensing image fine-grained target identification method based on self-supervision mechanism | |
CN105389799B (en) | SAR image object detection method based on sketch map and low-rank decomposition | |
CN111640116B (en) | Aerial photography graph building segmentation method and device based on deep convolutional residual error network | |
Song et al. | Extraction and reconstruction of curved surface buildings by contour clustering using airborne LiDAR data | |
CN113610070A (en) | Landslide disaster identification method based on multi-source data fusion | |
CN108734200A (en) | Human body target visible detection method and device based on BING features | |
Awad | Toward robust segmentation results based on fusion methods for very high resolution optical image and lidar data | |
CN116883588A (en) | Method and system for quickly reconstructing three-dimensional point cloud under large scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||