CN114419406A - Image change detection method, training method, device and computer equipment - Google Patents



Publication number
CN114419406A
Authority
CN
China
Prior art keywords
image
network
change detection
processed
generate
Prior art date
Legal status
Pending
Application number
CN202111516637.2A
Other languages
Chinese (zh)
Inventor
郑润蓝
佘楚云
张瑞
黄加祺
周永光
王兵
曹建伟
Current Assignee
Shenzhen Power Supply Bureau Co Ltd
Original Assignee
Shenzhen Power Supply Bureau Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Power Supply Bureau Co Ltd filed Critical Shenzhen Power Supply Bureau Co Ltd
Priority to CN202111516637.2A
Publication of CN114419406A
Legal status: Pending


Classifications

    • G06F18/253 — Pattern recognition: fusion techniques of extracted features
    • G06F18/2415 — Pattern recognition: classification based on parametric or probabilistic models, e.g. on likelihood ratio or false acceptance rate versus false rejection rate
    • G06N3/045 — Neural networks: combinations of networks
    • G06N3/08 — Neural networks: learning methods

Abstract

The application relates to an image change detection method, a training method, an apparatus, a computer device, a storage medium and a computer program product. The method comprises the following steps: acquiring an image to be processed, where the image to be processed comprises a first image and a second image corresponding to the same scene at different moments; inputting the image to be processed into a preset image change detection network for detection to generate a target feature map, where the preset image change detection network comprises a feature map pyramid network used for fusing feature maps of multiple scales of the image to be processed to generate the target feature map; and determining an image change detection result between the first image and the second image according to the target feature map. The method can improve the accuracy of image change detection.

Description

Image change detection method, training method, device and computer equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image change detection method, a training method, an apparatus, a computer device, a storage medium, and a computer program product.
Background
With the continuous development of image processing related technologies, image change detection technologies have been widely applied in many fields such as video monitoring, environmental monitoring, forest resource investigation, land utilization, urban planning and layout, and the like. The image change detection technology can detect changes of the same scene in different time periods.
When image change detection is carried out on the same scene, a real-time captured image of the scene is compared with a standard image of the scene to determine the difference between the images. The image change information can then be obtained from this difference.
In the prior art, when detecting image change of the same scene, the real-time captured image of the scene and the standard image of the scene are generally input into a conventional convolutional neural network for processing, so as to determine the difference between the images and thereby acquire the image change information. However, the image change information obtained in this way has low accuracy.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide an image change detection method, a training method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product capable of improving image change detection accuracy.
In a first aspect, the present application provides an image change detection method. The method comprises the following steps:
acquiring an image to be processed, where the image to be processed comprises a first image and a second image corresponding to the same scene at different moments; inputting the image to be processed into a preset image change detection network for detection to generate a target feature map, where the preset image change detection network comprises a feature map pyramid network used for fusing feature maps of multiple scales of the image to be processed to generate the target feature map; and determining an image change detection result between the first image and the second image according to the target feature map.
In one embodiment, the preset image change detection network further comprises a feature extraction network; inputting the image to be processed into the preset image change detection network for detection and generating a target feature map comprises the following steps:
inputting the image to be processed into the feature extraction network for feature extraction to generate feature maps of multiple scales; and inputting the feature maps of multiple scales into the feature map pyramid network and fusing the feature maps of multiple scales to generate the target feature map.
In one embodiment, the feature extraction network comprises an encoding network and a decoding network, wherein the encoding network adopts a residual network; inputting the image to be processed into the feature extraction network for feature extraction and generating feature maps of multiple scales comprises:
inputting the image to be processed into the residual network for processing to generate first feature maps of multiple scales; and inputting the first feature maps of multiple scales into the decoding network for processing to generate second feature maps of multiple scales.
In one embodiment, determining the image change detection result between the first image and the second image according to the target feature map comprises:
inputting the target feature map into a preset classifier for classification to generate, for each pixel point in the image to be processed, a corresponding category and the probability of the pixel point under that category; and generating the image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under that category.
In one embodiment, the categories include a change class and a non-change class; generating the image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under that category comprises the following steps:
determining the pixel points belonging to the change class according to the category corresponding to each pixel point and its probability under that category; denoising the pixel points belonging to the change class to generate target pixel points belonging to the change class; and generating the image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
In one embodiment, inputting the image to be processed into the preset image change detection network for detection and generating a target feature map further includes:
aligning the first image and the second image to generate an aligned first image and an aligned second image; performing difference processing on the aligned first image and the aligned second image to generate a difference image; and inputting the difference image into the preset image change detection network for detection to generate the target feature map.
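The difference-processing step above can be sketched as a pixel-wise absolute difference of the two aligned images. This is only an illustrative sketch: the alignment itself (e.g. by estimating a transform between the two views) is assumed to have already been performed, and the function name `difference_image` is not from the patent.

```python
import numpy as np

def difference_image(first: np.ndarray, second: np.ndarray) -> np.ndarray:
    """Pixel-wise absolute difference of two already-aligned images.

    Alignment is assumed to have been done beforehand; this sketch only
    illustrates the difference-processing step described in the text.
    """
    if first.shape != second.shape:
        raise ValueError("aligned images must have identical shapes")
    # promote to a signed type so the subtraction cannot wrap around
    return np.abs(first.astype(np.int32) - second.astype(np.int32)).astype(np.uint8)

a = np.array([[10, 20], [30, 40]], dtype=np.uint8)
b = np.array([[12, 20], [25, 40]], dtype=np.uint8)
diff = difference_image(a, b)  # non-zero only where the two views differ
```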
In a second aspect, the present application provides a training method for an image change detection network. The method comprises the following steps:
acquiring a training set, where the training set comprises a first image and a second image corresponding to the same scene at different moments and an image change detection labeling result between the first image and the second image; inputting the first image and the second image into an initial image change detection network for training to generate an image change detection prediction result; calculating the value of a loss function according to the image change detection prediction result and the image change detection labeling result, where the loss function comprises a joint loss function, and the joint loss function comprises a cross entropy loss function and a similarity measurement function; and adjusting the parameters of the initial image change detection network according to the value of the loss function to generate the preset image change detection network.
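The joint loss described above combines a cross-entropy term with a similarity measurement term. A minimal sketch is given below; note that the patent does not specify which similarity measure is used, so a Dice-style overlap term and the 0.5/0.5 weighting are purely illustrative assumptions.

```python
import numpy as np

def joint_loss(pred: np.ndarray, target: np.ndarray,
               weight: float = 0.5, eps: float = 1e-7) -> float:
    """Joint loss = weighted cross-entropy + similarity term.

    `pred` holds per-pixel probabilities of the change class, `target`
    holds 0/1 labels.  The similarity measurement function is not fixed
    by the text; 1 - Dice is used here as an illustrative assumption.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # binary cross-entropy averaged over all pixels
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Dice similarity 2|X∩Y| / (|X|+|Y|); subtracting from 1 gives a loss
    dice = (2.0 * np.sum(pred * target) + eps) / (np.sum(pred) + np.sum(target) + eps)
    return float(weight * bce + (1.0 - weight) * (1.0 - dice))
```

A near-perfect prediction should yield a loss close to zero, while a confidently wrong prediction yields a large one; this is what makes the joint loss usable for adjusting the network parameters by gradient descent.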
In a third aspect, the present application further provides an image change detection apparatus. The device comprises:
the first acquisition module is used for acquiring an image to be processed, where the image to be processed comprises a first image and a second image corresponding to the same scene at different moments;
the first generation module is used for inputting the image to be processed into a preset image change detection network for detection and generating a target feature map, where the preset image change detection network comprises a feature map pyramid network used for fusing feature maps of multiple scales of the image to be processed to generate the target feature map;
and the determining module is used for determining an image change detection result between the first image and the second image according to the target feature map.
In a fourth aspect, the present application further provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the method steps in any of the embodiments of the first aspect described above when executing the computer program.
In a fifth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, carries out the method steps of any of the embodiments of the first aspect described above.
In a sixth aspect, the present application further provides a computer program product. The computer program product comprises a computer program that, when executed by a processor, performs the method steps of any of the embodiments of the first aspect described above.
According to the image change detection method, the training method, the apparatus, the computer device, the storage medium and the computer program product, an image to be processed is acquired; the image to be processed is input into a preset image change detection network for detection to generate a target feature map, where the preset image change detection network comprises a feature map pyramid network used for fusing feature maps of multiple scales of the image to be processed to generate the target feature map; and an image change detection result between the first image and the second image is determined according to the target feature map. In the technical scheme provided by the embodiments of the application, the feature maps of multiple scales of the image to be processed are fused through the feature map pyramid network, so that the obtained target feature map integrates image features of multiple scales. This improves the accuracy of the extracted feature map, and therefore also the accuracy of image change detection performed according to the target feature map.
Drawings
FIG. 1 is a diagram illustrating an internal structure of a computer device according to an embodiment;
FIG. 2 is a schematic flow chart diagram illustrating an exemplary method for detecting image changes;
FIG. 3 is a schematic flow chart for generating a target feature map in one embodiment;
FIG. 4 is a diagram illustrating the structure of a basic unet network in one embodiment;
FIG. 5 is a diagram illustrating feature fusion performed by the FPN network, according to one embodiment;
FIG. 6 is a schematic flow diagram for generating feature maps of multiple scales in one embodiment;
FIG. 7 is a diagram illustrating the structure of a residual network in one embodiment;
FIG. 8 is a schematic diagram of a process for determining an image change detection result according to one embodiment;
FIG. 9 is a schematic flow chart illustrating optimization of image change detection results according to one embodiment;
FIG. 10 is a schematic flow chart of generating an input image in one embodiment;
FIG. 11 is a schematic diagram illustrating an embodiment of a process for generating a preset image change detection network;
FIG. 12 is an overall architecture diagram of image change detection in one embodiment;
fig. 13 is a block diagram showing the structure of an image change detection apparatus according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image change detection method provided by the application can be applied to computer equipment, the computer equipment can be a server or a terminal, wherein the server can be one server or a server cluster consisting of a plurality of servers.
Taking the example of a computer device being a server, fig. 1 shows a block diagram of a server, which, as shown in fig. 1, comprises a processor, a memory and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing image change detection data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image change detection method.
Those skilled in the art will appreciate that the architecture shown in fig. 1 is a block diagram of only a portion of the architecture associated with the subject application, and does not constitute a limitation on the servers to which the subject application applies, and that servers may alternatively include more or fewer components than those shown, or combine certain components, or have a different arrangement of components.
The execution subject of the embodiments of the present application may be a computer device, or may be an image change detection apparatus, and the following method embodiments will be described with reference to the computer device as the execution subject.
In one embodiment, as shown in fig. 2, which shows a flowchart of image change detection provided by an embodiment of the present application, the method may include the following steps:
step 220, acquiring an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments.
One of the first image and the second image in the image to be processed can be used as a reference image and the other as an image to be detected; the difference between the two images can be determined by comparing the image to be detected with the reference image. The first image and the second image corresponding to the same scene at different moments may come from a fixed camera at the same geographic position and the same shooting angle, differing only in shooting time, so that secondary factors such as illumination can be ignored; they may also be captured by a mobile robot or an unmanned aerial vehicle. The acquired images can be used directly as the image to be processed, or the image to be processed can be obtained after an image preprocessing operation, where the preprocessing operation may include, but is not limited to, image denoising, image size scaling, and the like.
Step 240, inputting the image to be processed into a preset image change detection network for detection, and generating a target characteristic diagram; the preset image change detection network comprises a feature map pyramid network, and the feature map pyramid network is used for fusing feature maps of multiple scales of the image to be processed to generate a target feature map.
The preset image change detection network is obtained by training on a sample set of images to be processed; the sample set may comprise a first image and a second image corresponding to the same scene at different moments and an image change detection labeling result between the first image and the second image. The preset image change detection network is used for extracting the features of the image to be processed. The target feature map may be generated by inputting the image to be processed into the preset image change detection network for detection, where the preset image change detection network may be based on a semantic segmentation network; the semantic segmentation network may be a unet network or another type of network, which is not specifically limited in this embodiment. The preset image change detection network may comprise a Feature Pyramid Network (FPN). The FPN mainly addresses the multi-scale problem in object detection: through a simple change to the network connections, and with almost no increase in the computation of the original model, it greatly improves the detection of small objects. The feature maps of multiple scales of the image to be processed can be fused through the FPN to generate the target feature map, where the feature maps of multiple scales can be obtained through feature extraction in the preset image change detection network.
And step 260, determining an image change detection result between the first image and the second image according to the target feature map.
After the target feature map is generated, all pixel points on the target feature map can be classified, so that each pixel point is determined to belong to a change category or a non-change category, and finally, an image change detection result between the first image and the second image is generated according to the pixel points of different categories. The image change detection result may be composed of pixel points of all categories, or may be composed of pixel points belonging to only the change category. Optionally, the target feature map may be transmitted to a preset classifier to classify each pixel point on the target feature map, or other manners may be adopted to classify each pixel point on the target feature map, and the preset classifier may be a softmax classifier or other types of classifiers.
In the embodiment, the image to be processed is obtained; inputting an image to be processed into a preset image change detection network for detection, and generating a target characteristic diagram; the preset image change detection network comprises a characteristic graph pyramid network, and the characteristic graph pyramid network is used for fusing characteristic graphs of multiple scales of the image to be processed to generate a target characteristic graph; and determining an image change detection result between the first image and the second image according to the target feature map. The characteristic graphs of the images to be processed in multiple scales can be fused through the characteristic graph pyramid network, so that the obtained target characteristic graph integrates the image characteristics in multiple scales, the accuracy of extracting the characteristic graph is improved, and the accuracy of image change detection is improved when image change detection is carried out according to the target characteristic graph.
In an embodiment, the preset image change detection network further includes a feature extraction network, as shown in fig. 3, which shows a flowchart of image change detection provided in an embodiment of the present application, and in particular relates to a possible process for generating a target feature map, where the method may include the following steps:
and step 320, inputting the image to be processed into a feature extraction network for feature extraction, and generating feature maps with multiple scales.
And 340, inputting the feature maps of multiple scales into the feature map pyramid network, and fusing the feature maps of multiple scales to generate a target feature map.
The feature extraction network is used for extracting features of the image to be processed, and may be composed of a basic unet network or an improved unet network. Taking the basic unet network as an example, as shown in fig. 4, fig. 4 is a schematic structural diagram of the basic unet network; the network is shaped like a "U" and is therefore called unet. The basic unet network may include an encoding network and a decoding network.
In fig. 4, each bar refers to a feature map. The feature extraction path of the encoding network includes multiple convolution and pooling operations which, from top to bottom, generate feature maps of different sizes from low dimension to high dimension; there are 5 levels of feature map sizes in total. 3×3 convolution layers between laterally adjacent feature maps extract features, and 2×2 pooling layers between the upper and lower feature maps reduce the dimensions. The decoding network performs deconvolution up-sampling: 3×3 convolution layers between laterally adjacent feature maps extract features, and up-sampling between the bottom-to-top feature maps restores the dimensionality, doubling the size of the feature map at each up-sampling step.
The image to be processed is input into the encoding network and the decoding network of the feature extraction network for feature extraction, finally generating feature maps of multiple scales; the obtained feature maps of multiple scales are input into the feature map pyramid network and fused to generate the target feature map. Specifically, as shown in fig. 5, a schematic diagram of feature fusion in the FPN, the FPN transfers deep image feature information from top to bottom to the lower layers through lateral connections and up-sampling. Concretely, the features of the higher feature layer are up-sampled by a factor of 2, that is, new pixels are inserted between the original pixel points using a suitable interpolation algorithm to enlarge the size of the feature map; the number of channels of the lower-layer features is changed by a 1×1 convolution; and the corresponding pixels of the up-sampled result and the 1×1-convolved result are then added to obtain the fused target feature map.
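The fusion step just described (2× up-sampling, a 1×1 convolution to match channels, then element-wise addition) can be sketched on toy arrays as follows. This is a minimal numpy illustration, not the patent's implementation: nearest-neighbour up-sampling stands in for the unspecified interpolation algorithm, and the names `fpn_fuse`, `upsample2x` and `conv1x1` are assumptions.

```python
import numpy as np

def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """A 1x1 convolution is just a channel-mixing matrix multiply:
    (C_out, C_in) applied at every spatial position of (C_in, H, W)."""
    return np.tensordot(w, x, axes=([1], [0]))

def fpn_fuse(top: np.ndarray, lateral: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Fuse a deeper feature map `top` into a shallower `lateral` map:
    up-sample `top` by 2, match channels of `lateral` with a 1x1
    convolution, then add element-wise."""
    return upsample2x(top) + conv1x1(lateral, w)

# toy maps: top is (2 ch, 2x2), lateral is (4 ch, 4x4); w maps 4 -> 2 channels
top = np.ones((2, 2, 2))
lateral = np.ones((4, 4, 4))
w = np.full((2, 4), 0.25)          # averages the 4 lateral channels
fused = fpn_fuse(top, lateral, w)  # shape (2, 4, 4)
```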
In this embodiment, an image to be processed is input into a feature extraction network for feature extraction, so as to generate feature maps of multiple scales, the feature maps of multiple scales are input into a feature map pyramid network, and the feature maps of multiple scales are fused, so as to generate a target feature map. The feature graphs of multiple scales are extracted through the feature extraction network, so that the feature graph pyramid network can conveniently fuse the feature graphs of multiple scales.
In an embodiment, the feature extraction network includes an encoding network and a decoding network, the encoding network employs a residual error network, as shown in fig. 6, which shows a flowchart of image change detection provided in an embodiment of the present application, and in particular relates to a possible process for generating multiple scale feature maps, and the method may include the following steps:
and step 620, inputting the image to be processed into a residual error network for processing, and generating first feature maps with multiple scales.
And step 640, inputting the first feature maps of multiple scales into a decoding network for processing, and generating second feature maps of multiple scales.
The encoding network may include a plurality of encoding modules, and the decoding network may include a plurality of decoding modules. The residual network may be a Resnet34 residual network; fig. 7 is a schematic structural diagram of a Resnet34 residual network. A Resnet residual network is constructed from residual blocks; a residual block includes a direct mapping part and a residual part, where the residual part generally includes two or three convolution operations, and fig. 7 shows a structure with two convolution operations. Between the two convolutions, a Batch Normalization (BN) layer can be added to normalize the data, which can speed up the convergence of the network and improve its robustness.
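The residual block described above (direct mapping plus a two-convolution residual part with BN) can be sketched for a single channel as y = x + F(x). This is a deliberately simplified numpy illustration: real Resnet blocks operate on many channels with learned BN parameters, and the function names here are assumptions.

```python
import numpy as np

def conv3x3(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Single-channel 3x3 convolution with zero padding (shape-preserving)."""
    h, w = x.shape
    padded = np.pad(x, 1)
    out = np.zeros_like(x, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
    return out

def batch_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalise a feature map to zero mean / unit variance (BN with a batch of one)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def residual_block(x: np.ndarray, k1: np.ndarray, k2: np.ndarray) -> np.ndarray:
    """y = x + F(x), where F is conv -> BN -> ReLU -> conv -> BN: a
    single-channel version of the two-convolution residual part in the text."""
    out = np.maximum(batch_norm(conv3x3(x, k1)), 0.0)  # ReLU
    out = batch_norm(conv3x3(out, k2))
    return x + out  # direct mapping (identity shortcut) plus residual part

x = np.arange(16.0).reshape(4, 4)
y = residual_block(x, np.zeros((3, 3)), np.zeros((3, 3)))  # zero kernels: F(x) = 0, so y == x
```

The identity shortcut is what makes such a block easy to optimise: when the residual part contributes nothing, the block degenerates to the identity instead of having to learn it.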
The image to be processed can be input into the residual network for processing to generate first feature maps of multiple scales, which are obtained by performing convolution and pooling operations multiple times through the residual network. The decoding network mainly performs deconvolution operations, and the feature maps of the encoding network are connected to the feature maps of the decoding network through skip connections; that is, the input of each decoding module is fused with the output of the encoding module of the corresponding layer and used as the input of the next deconvolution layer. The first feature maps of multiple scales are thus input into the decoding network for processing, generating second feature maps of multiple scales.
Optionally, the encoding network may comprise four encoding modules, encode1, encode2, encode3 and encode4, which process the input in turn to generate first feature maps of multiple scales, namely the feature maps f1, f2, f3 and f4 of different levels. Each encoding module halves the scale of the feature map and doubles the dimension; the convolution operations in the encoding modules extract feature information, and the pooling layers filter out some unimportant high-frequency information. The decoding network may likewise comprise four decoding modules, decoder1, decoder2, decoder3 and decoder4. Each decoding module doubles the scale of the feature map and halves the dimension; the input of each decoding module is fused with the output of the encoding module of the corresponding layer and used as the input of the next deconvolution layer, and the skip connections reduce the information loss caused by the pooling layers in the encoding modules.
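The shape bookkeeping of one encode/decode pair (scale halved and dimension doubled on the way down, the reverse on the way up, with the skip connection fused in) can be sketched as follows. This is only a shape-level illustration under stated simplifications: channels are duplicated rather than learned, up-sampling is nearest-neighbour, and the skip fusion is element-wise addition.

```python
import numpy as np

def encode_step(x: np.ndarray) -> np.ndarray:
    """One encoding module reduced to its effect on tensor shape:
    2x2 max-pooling halves H and W, and the channel count doubles
    (here by duplicating channels; a real module learns new ones)."""
    c, h, w = x.shape
    pooled = x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))
    return np.concatenate([pooled, pooled], axis=0)

def decode_step(x: np.ndarray, skip: np.ndarray) -> np.ndarray:
    """One decoding module: 2x up-sampling doubles H and W, the channel
    count is halved, and the skip connection from the matching encoder
    level is fused in element-wise."""
    c = x.shape[0] // 2
    up = x[:c].repeat(2, axis=1).repeat(2, axis=2)
    return up + skip

x = np.random.rand(4, 8, 8)      # (channels, H, W)
e = encode_step(x)               # -> (8, 4, 4): half the scale, double the dimension
d = decode_step(e, skip=x)       # -> (4, 8, 8): back to the encoder input's shape
```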
In this embodiment, the image to be processed is input into the residual network for processing to generate first feature maps of multiple scales, and the first feature maps of multiple scales are input into the decoding network for processing to generate second feature maps of multiple scales. Because a residual network is used as the feature extraction layer of the encoding network, optimization is easier, the accuracy of feature extraction can be improved by increasing the effective depth, and the internal residual blocks alleviate the vanishing-gradient problem caused by increasing the depth of a deep neural network. Moreover, because the encoding network and the decoding network are connected through skip connections, the decoding network can restore the detail information of the target more accurately.
In one embodiment, as shown in fig. 8, which illustrates a flowchart of image change detection provided in an embodiment of the present application, and particularly relates to a possible process for determining an image change detection result, the method may include the following steps:
and 820, inputting the target feature map into a preset classifier for classification, and generating a category corresponding to each pixel point in the image to be processed and the probability of the pixel point under the category.
And 840, aiming at each pixel point in the image to be processed, generating an image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under the category.
The FPN fuses and up-samples the feature maps of different scales into a target feature map with the same size as the input image, and the target feature map can be passed to a preset classifier, which may be a softmax classifier. For each category, the preset classifier calculates the probability that each pixel point on the input target feature map belongs to that category, and for the same pixel point the probabilities over the different categories sum to 1. The target feature map is input into the preset classifier for classification, generating the category corresponding to each pixel point in the image to be processed and the probability of the pixel point under that category. For each pixel point in the image to be processed, all the pixel points belonging to the same category are obtained according to the categories and probabilities, and the image change detection result between the first image and the second image can then be generated from the pixel points under each category.
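The per-pixel softmax classification just described can be sketched as follows: the class probabilities at every pixel sum to 1, and the predicted category is the argmax. This is a generic numpy sketch of a softmax classifier, not code from the patent.

```python
import numpy as np

def softmax_classify(logits: np.ndarray):
    """Per-pixel softmax over class scores of shape (num_classes, H, W).

    Returns (probabilities, predicted class index per pixel); for every
    pixel the probabilities over the classes sum to 1."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    probs = e / e.sum(axis=0, keepdims=True)
    return probs, probs.argmax(axis=0)

# two classes (0 = non-change, 1 = change) over a 2x2 target feature map
logits = np.array([[[2.0, -1.0], [0.0, 3.0]],
                   [[0.0,  1.0], [0.0, 0.0]]])
probs, labels = softmax_classify(logits)  # labels holds 0 or 1 per pixel
```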
In this embodiment, the target feature map is input into the preset classifier for classification, generating the category corresponding to each pixel point in the image to be processed and the probability of the pixel point under that category; then, for each pixel point in the image to be processed, an image change detection result between the first image and the second image is generated according to the category corresponding to the pixel point and its probability under that category. Processing the target feature map with the preset classifier allows the image change detection result to be obtained more efficiently and accurately.
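A minimal sketch of the per-pixel softmax classification described above (the class layout and the logit values are made-up assumptions for illustration):

```python
import numpy as np

def classify_pixels(logits):
    """Per-pixel softmax over the class axis (assumed to be axis 0 here):
    returns the winning class and its probability for every pixel."""
    e = np.exp(logits - logits.max(axis=0, keepdims=True))  # numerically stable
    probs = e / e.sum(axis=0, keepdims=True)                # sums to 1 per pixel
    return probs.argmax(axis=0), probs.max(axis=0)

# Toy map: 2 classes (0 = non-change, 1 = change) over a 2x2 image.
logits = np.array([[[2.0, 0.0], [0.0, 3.0]],   # class-0 logits
                   [[0.0, 2.0], [1.0, 0.0]]])  # class-1 logits
classes, max_probs = classify_pixels(logits)
print(classes)  # [[0 1] [1 0]]
```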
In one embodiment, as shown in fig. 9, which illustrates a flowchart of image change detection provided in an embodiment of the present application, and particularly relates to a possible process for optimizing an image change detection result, the method may include the following steps:
Step 920, determining the pixel points belonging to the change class from the pixel points according to the category corresponding to each pixel point and the probability of the pixel point under the category.

Step 940, denoising the pixel points belonging to the change class to generate target pixel points belonging to the change class.
Step 960, generating an image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
The categories corresponding to the pixel points can include a change class and a non-change class. The probability of each pixel point under each category is computed by the preset classifier, and whether a pixel point belongs to the change class or the non-change class can be determined by comparing the probability values. For example, if the probability that a pixel point belongs to the change class is 0.8 and the probability that it belongs to the non-change class is 0.2, the pixel point is determined to belong to the change class by comparing the two values. All pixel points belonging to the change class can finally be obtained in this way.
The pixel points belonging to the change class are then denoised to generate target pixel points belonging to the change class. The denoising process corrects misclassifications left by the preset classifier. Specifically, denoising can be performed according to the distribution of the pixel points belonging to the change class: for example, if the number of change-class pixel points within a region with the preset size of 1 × 1 is less than a preset value, the change-class pixel points in that region are removed and treated as non-change-class pixel points; the target pixel points belonging to the change class are what remain after denoising. Connected domains are then extracted from the target pixel points belonging to the change class. A connected domain can be used directly as a change region in the image change detection result, or its circumscribed rectangle can be used as the change region. The process of extracting connected domains is well known to those skilled in the art and is not described again here.
In this embodiment, the pixel points belonging to the change class are determined from the pixel points according to the category corresponding to each pixel point and the probability of the pixel point under the category; the pixel points belonging to the change class are denoised to generate target pixel points belonging to the change class; and the image change detection result between the first image and the second image is generated according to the target pixel points belonging to the change class. Denoising the change-class pixel points extracts more accurate change-class pixel points and thus improves the accuracy of image change detection.
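The denoising and connected-domain extraction can be sketched as follows (the 3×3 window and the count threshold are illustrative assumptions, not the patent's preset values):

```python
import numpy as np
from collections import deque

def denoise(mask, win=3, min_count=3):
    """Drop isolated change pixels: a change pixel survives only if the
    win x win window around it holds at least min_count change pixels."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    r = win // 2
    for i in range(h):
        for j in range(w):
            if mask[i, j]:
                window = mask[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
                if window.sum() >= min_count:
                    out[i, j] = 1
    return out

def connected_regions(mask):
    """4-connected components via BFS; returns bounding boxes
    (top, left, bottom, right) usable as change regions."""
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                q = deque([(i, j)])
                seen[i, j] = True
                top, left, bottom, right = i, j, i, j
                while q:
                    y, x = q.popleft()
                    top, left = min(top, y), min(left, x)
                    bottom, right = max(bottom, y), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

mask = np.zeros((6, 6), dtype=int)
mask[1:4, 1:4] = 1          # a real 3x3 change blob
mask[5, 5] = 1              # an isolated noise pixel
clean = denoise(mask)
print(connected_regions(clean))  # [(1, 1, 3, 3)]
```

The isolated pixel is removed by the window count while the dense blob survives, and the bounding box of the remaining connected domain serves as the change region.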
In one embodiment, as shown in fig. 10, which illustrates a flowchart of image change detection provided by an embodiment of the present application, and particularly relates to a possible process of generating an input image, the method may include the following steps:
Step 1020, performing alignment processing on the first image and the second image to generate an aligned first image and an aligned second image.

Step 1040, performing difference processing on the aligned first image and the aligned second image to generate a difference image.

Step 1060, inputting the difference image into a preset image change detection network for detection to generate a target feature map.
First, a preset algorithm is used to detect key points in both the reference image and the image to be detected and to extract feature descriptors of the key points; the preset algorithm can be a traditional method such as SIFT, SURF, or ORB, or a deep learning method such as D2-Net. Next, the key points are matched: for example, a KNN algorithm can be used to match key points between the two images, and the RANSAC algorithm is used to screen the matched key points (matching points for short). RANSAC is an algorithm that estimates the parameters of a mathematical model from a sample data set containing abnormal data, so as to obtain the valid sample data. Finally, a transformation matrix is computed from the matching points, and the image to be detected is aligned to the reference image based on this matrix; that is, in the aligned first image and the aligned second image, the reference image remains unchanged and the image to be detected is the aligned one.
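The RANSAC screening step can be illustrated on a deliberately simplified motion model: a pure 2-D translation estimated from synthetic point matches, rather than a full homography computed from SIFT/ORB keypoints. Everything below (function name, tolerance, the synthetic data) is an assumption for illustration:

```python
import numpy as np

def ransac_translation(src, dst, iters=200, tol=1.0, seed=0):
    """Estimate a 2-D translation from noisy point matches: repeatedly
    hypothesize a model from one random match, count inliers, and keep
    the hypothesis with the largest consensus set."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        k = rng.integers(len(src))
        t = dst[k] - src[k]                      # model from a minimal sample
        err = np.linalg.norm(src + t - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set for a stable final estimate.
    best_t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return best_t, best_inliers

rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(30, 2))
dst = src + np.array([5.0, -3.0])                # true translation
dst[:5] = rng.uniform(0, 100, size=(5, 2))       # 5 bad matches (outliers)
t, inliers = ransac_translation(src, dst)
print(np.round(t, 3))  # close to [5. -3.]
```

The same consensus principle is what screens the KNN matches before the transformation matrix is fitted; in practice the model would be a homography and the matches would come from the detected keypoints.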
After the alignment process, the two aligned images can be resized to the same size, for example 608×608. Different scenes call for different handling: when the resolution of the picture is large but the objects in it are of normal size, i.e., the aligned image is much larger than 608×608, the aligned image should not be resized directly to 608×608. Instead, the picture can be divided into sub-regions of size 608×608, or the divided sub-region images can be resized to 608×608, and the sub-region images are then subjected to the subsequent processing.
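The sub-region splitting described above can be sketched as follows (the tile size is a parameter; the text uses 608, the demo uses a small value, and zero-padding of the border is an added assumption):

```python
import numpy as np

def split_into_tiles(image, tile=608):
    """Split an H x W image into non-overlapping tile x tile sub-regions,
    zero-padding the border so every tile has the full size.
    Returns (top-left offset, tile) pairs so results can be mapped back."""
    h, w = image.shape[:2]
    ph = (tile - h % tile) % tile
    pw = (tile - w % tile) % tile
    padded = np.pad(image, ((0, ph), (0, pw)))
    tiles = []
    for i in range(0, padded.shape[0], tile):
        for j in range(0, padded.shape[1], tile):
            tiles.append(((i, j), padded[i:i + tile, j:j + tile]))
    return tiles

img = np.arange(25.0).reshape(5, 5)
tiles = split_into_tiles(img, tile=4)
print(len(tiles))  # 4 tiles of 4x4 covering the padded 8x8 image
```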
Difference processing is performed on the aligned first image and the aligned second image to generate a difference image. An image difference algorithm can be used for this, or an image ratio algorithm can be used instead; this embodiment does not specifically limit the manner of generating the difference image. Finally, the difference image can be input directly into the preset image change detection network for detection to generate the target feature map.
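Both difference strategies mentioned above can be sketched in a few lines (the log-ratio form and the epsilon guard are common conventions assumed here, not mandated by the text):

```python
import numpy as np

def difference_image(a, b, method="difference", eps=1e-6):
    """Pixel-wise change evidence between two aligned images:
    'difference' uses |a - b|; 'ratio' uses the absolute log-ratio,
    which is symmetric in a and b (eps avoids division by zero)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    if method == "difference":
        return np.abs(a - b)
    if method == "ratio":
        return np.abs(np.log((a + eps) / (b + eps)))
    raise ValueError(method)

a = np.array([[10.0, 20.0], [30.0, 40.0]])
b = np.array([[10.0, 25.0], [30.0, 40.0]])
print(difference_image(a, b))  # only the changed pixel is non-zero
```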
In this embodiment, the first image and the second image after alignment are generated by performing alignment processing on the first image and the second image; performing difference processing on the aligned first image and the aligned second image to generate a difference image; and inputting the difference image into a preset image change detection network for detection to generate a target characteristic diagram. After the first image and the second image are aligned, the two images can be conveniently compared subsequently, and the efficiency and the accuracy of image change detection are improved.
In one embodiment, as shown in fig. 11, which illustrates a flowchart of image change detection provided in an embodiment of the present application, specifically, a possible process for generating a preset image change detection network is provided, where the method may include the following steps:
step 1120, acquiring a training set; the training set comprises a first image, a second image, and image change detection labeling results between the first image and the second image corresponding to the same scene at different moments.
Step 1140, the first image and the second image are input to an initial image change detection network for training to generate an image change detection prediction result.
Step 1160, calculating a loss function value according to the image change detection prediction result and the image change detection labeling result; the loss function comprises a joint loss function comprising a cross entropy loss function and a similarity measure function.
Step 1180, adjusting parameters of the initial image change detection network according to the value of the loss function to generate a preset image change detection network.
The initial image change detection network is obtained after the network parameters are initialized. The first image and the second image are input into the initial image change detection network, which outputs an image change detection prediction result after learning the image features of the two images. The prediction result and the image change detection labeling result are substituted into a preset loss function to calculate the value of the loss function, and the parameters of the initial image change detection network are adjusted according to this value, so that the network parameters that minimize the value of the loss function serve as the optimal network parameters, and the preset image change detection network is generated from the optimal network parameters.
The preset loss function comprises a joint loss function, which combines a cross entropy loss function and a similarity measurement function. The cross entropy loss function is the classical binary cross-entropy loss and can be expressed as formula (1); the similarity measurement function is the Dice function, which is concerned only with whether pixel points are correctly classified and can be expressed as formula (2); the joint loss function can be expressed by formula (3).
L_cross = -(1/N) · Σ_n [t_n · log(p_n) + (1 - t_n) · log(1 - p_n)]  (1)

L_Dice = 1 - (2 · Σ_n y_n · t_n) / (Σ_n y_n + Σ_n t_n)  (2)

L = L_cross + L_Dice  (3)
Wherein t_n represents the real label class (t_n = 0 denotes the non-change class and t_n = 1 denotes the change class); p_n represents the predicted probability that pixel n belongs to the change class; N is the total number of pixels in a sample; n indexes a pixel in the sample; and y_n represents the predicted class.
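Under the definitions above, the joint loss can be sketched numerically; treating y_n in the Dice term as the soft prediction p_n is an assumption made here so the illustration stays differentiable:

```python
import numpy as np

def joint_loss(p, t, eps=1e-7):
    """L = L_cross + L_Dice over flattened per-pixel change probabilities p
    and binary labels t (1 = change, 0 = non-change)."""
    p = np.clip(p, eps, 1.0 - eps)                     # guard log(0)
    l_cross = -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))
    l_dice = 1.0 - (2.0 * np.sum(p * t) + eps) / (np.sum(p) + np.sum(t) + eps)
    return l_cross + l_dice

t = np.array([1.0, 0.0, 1.0, 0.0])
good = joint_loss(np.array([0.9, 0.1, 0.8, 0.2]), t)
bad = joint_loss(np.array([0.2, 0.8, 0.3, 0.7]), t)
print(good < bad)  # True: better predictions give a lower joint loss
```

Because the Dice term is a ratio over the change pixels only, it keeps gradient pressure on the (rare) change class even when most pixels are unchanged, which is the class-imbalance argument the text makes.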
In addition, the parameters of the model need to be updated by an optimizer during network training so that the loss function is reduced to a minimum; the optimizer can be SGD, BGD, Adam, or similar. Taking the Adam optimizer as an example: Adam performs first-order gradient optimization of a stochastic objective function based on adaptive low-order moment estimation, and, to prevent the model from falling into a local optimum while still converging quickly, a simulated annealing schedule can be introduced to adjust the learning rate. Further, since the classification result of each pixel affects the final result, the F1 score can be used as the evaluation index of the network to judge the accuracy of the image change detection network. Specifically, the F1 score is the harmonic mean of recall and precision; the higher the F1 score, the better the performance of the algorithm. The F1 score can be expressed as shown in formula (4).
F1 = 2 · P · R / (P + R), where P = TP / (TP + FP) and R = TP / (TP + FN)  (4)
Wherein TP denotes pixels of the current class correctly predicted as the current class; FP denotes pixels of other classes mistakenly predicted as the current class; and FN denotes pixels of the current class mistakenly predicted as other classes.
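The F1 evaluation can be sketched directly from predicted and ground-truth change masks:

```python
import numpy as np

def f1_score(pred, truth):
    """Per-pixel F1 for the change class: harmonic mean of precision
    (TP / (TP + FP)) and recall (TP / (TP + FN))."""
    tp = np.sum((pred == 1) & (truth == 1))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

truth = np.array([1, 1, 1, 0, 0, 0])
pred = np.array([1, 1, 0, 1, 0, 0])
print(round(f1_score(pred, truth), 3))  # 0.667: P = 2/3, R = 2/3
```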
In this embodiment, the first image and the second image are input into the initial image change detection network for training to generate an image change detection prediction result, the value of the joint loss function is calculated according to the prediction result and the image change detection labeling result, and the parameters of the initial image change detection network are adjusted according to the value of the loss function to generate the preset image change detection network. Because image change detection suffers from unbalanced sample distribution, using the classical binary cross-entropy loss alone causes the network to emphasize learning the features of the flatter, unchanged portions; therefore the Dice function, which is concerned only with whether pixel points are correctly classified, is added as a supplement to the cross entropy loss. This reduces the influence of class imbalance on network accuracy and improves the accuracy of network training.
In an embodiment, as shown in fig. 12, which illustrates the overall architecture for image change detection provided in the embodiment of the present application: a difference image is obtained by taking the difference between the reference image and the image to be detected; the difference image is input into the preset image change detection network for detection, where feature maps of multiple scales are fused by the feature map pyramid network and output; the output feature map is input into a preset softmax classifier, which computes the probabilities of the pixel points on the feature map under the different classes and outputs a probability distribution map; finally, the parameters of the image change detection network are optimized with a back propagation algorithm according to the joint loss function.
In this embodiment, the image to be processed is obtained; the image to be processed is input into a preset image change detection network for detection to generate a target feature map, where the preset image change detection network comprises a feature map pyramid network used to fuse feature maps of multiple scales of the image to be processed into the target feature map; and the image change detection result between the first image and the second image is determined according to the target feature map. Because the feature map pyramid network fuses the feature maps of the image to be processed at multiple scales, the obtained target feature map integrates image features at multiple scales, which improves the accuracy of feature extraction and, in turn, the accuracy of image change detection performed on the target feature map.
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated. Unless explicitly stated otherwise herein, there is no strict order restriction on the execution of these steps, and they may be performed in other orders. Moreover, at least a part of the steps in those flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different moments, and whose execution order is not necessarily sequential: they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an image change detection apparatus for implementing the image change detection method. The implementation scheme for solving the problem provided by the apparatus is similar to the implementation scheme described in the above method, so specific limitations in one or more embodiments of the image change detection apparatus provided below can be referred to the limitations of the image change detection method in the foregoing, and details are not described herein again.
In one embodiment, as shown in fig. 13, there is provided an image change detection apparatus 1300 including: a first obtaining module 1302, a first generating module 1304, and a determining module 1306, wherein:
a first obtaining module 1302, configured to obtain an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments.
A first generating module 1304, configured to input the image to be processed into a preset image change detection network for detection, so as to generate a target feature map; the preset image change detection network comprises a feature map pyramid network, and the feature map pyramid network is used for fusing feature maps of multiple scales of the image to be processed to generate a target feature map.
The determining module 1306 is configured to determine an image change detection result between the first image and the second image according to the target feature map.
In one embodiment, the preset image change detection network further comprises a feature extraction network; the first generating module 1304 is specifically configured to input the image to be processed into a feature extraction network for feature extraction, so as to generate feature maps of multiple scales; and inputting the feature maps of multiple scales into the feature map pyramid network, and fusing the feature maps of multiple scales to generate a target feature map.
In one embodiment, the feature extraction network comprises an encoding network and a decoding network, wherein the encoding network adopts a residual network; the first generating module 1304 is further configured to input the image to be processed into the residual network for processing to generate first feature maps of multiple scales, and to input the first feature maps of multiple scales into the decoding network for processing to generate second feature maps of multiple scales.
In an embodiment, the determining module 1306 is specifically configured to input the target feature map into a preset classifier for classification, and generate a category corresponding to each pixel point in the image to be processed and a probability of the pixel point under the category; and generating an image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under the category for each pixel point in the image to be processed.
In one embodiment, the categories include a change class and a non-change class; the determining module 1306 is further configured to determine, according to the category corresponding to each pixel point and the probability of the pixel point under the category, the pixel points belonging to the change class from the pixel points; denoise the pixel points belonging to the change class to generate target pixel points belonging to the change class; and generate an image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
In one embodiment, the first generating module 1304 is specifically configured to perform an alignment process on a first image and a second image to generate an aligned first image and an aligned second image; performing difference processing on the aligned first image and the aligned second image to generate a difference image; and inputting the difference image into a preset image change detection network for detection to generate a target characteristic diagram.
In one embodiment, there is provided a training apparatus for an image change detection network, including: the device comprises a second acquisition module, an input module, a calculation module and a second generation module, wherein:
the second acquisition module is used for acquiring a training set; the training set comprises a first image, a second image, and image change detection labeling results between the first image and the second image corresponding to the same scene at different moments.
And the input module is used for inputting the first image and the second image into the initial image change detection network for training to generate an image change detection prediction result.
The calculation module is used for calculating the value of the loss function according to the image change detection prediction result and the image change detection annotation result; the loss function comprises a joint loss function comprising a cross entropy loss function and a similarity measure function.
And the second generation module is used for adjusting the parameters of the initial image change detection network according to the value of the loss function and generating a preset image change detection network.
The modules in the image change detection device and the training device of the image change detection network can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments; inputting an image to be processed into a preset image change detection network for detection, and generating a target characteristic diagram; the preset image change detection network comprises a characteristic graph pyramid network, and the characteristic graph pyramid network is used for fusing characteristic graphs of multiple scales of the image to be processed to generate a target characteristic graph; and determining an image change detection result between the first image and the second image according to the target feature map.
In one embodiment, the preset image change detection network further comprises a feature extraction network;
the processor, when executing the computer program, further performs the steps of:
inputting an image to be processed into a feature extraction network for feature extraction, and generating feature maps with multiple scales; and inputting the feature maps of multiple scales into the feature map pyramid network, and fusing the feature maps of multiple scales to generate a target feature map.
In one embodiment, the feature extraction network comprises an encoding network and a decoding network, wherein the encoding network adopts a residual network;
the processor, when executing the computer program, further performs the steps of:
inputting the image to be processed into the residual network for processing to generate first feature maps of multiple scales; and inputting the first feature maps of multiple scales into a decoding network for processing to generate second feature maps of multiple scales.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
inputting the target characteristic graph into a preset classifier for classification, and generating categories corresponding to all pixel points in the image to be processed and the probability of the pixel points under the categories; and generating an image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under the category for each pixel point in the image to be processed.
In one embodiment, the categories include a change class and a non-change class;
the processor, when executing the computer program, further performs the steps of:
determining the pixel points belonging to the change class from the pixel points according to the category corresponding to each pixel point and the probability of the pixel point under the category; denoising the pixel points belonging to the change class to generate target pixel points belonging to the change class; and generating an image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
aligning the first image and the second image to generate an aligned first image and an aligned second image; performing difference processing on the aligned first image and the aligned second image to generate a difference image; and inputting the difference image into a preset image change detection network for detection to generate a target characteristic diagram.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring a training set; the training set comprises a first image, a second image, and image change detection labeling results between the first image and the second image which correspond to the same scene at different moments; inputting the first image and the second image into an initial image change detection network for training to generate an image change detection prediction result; calculating the value of the loss function according to the image change detection prediction result and the image change detection annotation result; the loss function comprises a joint loss function, and the joint loss function comprises a cross entropy loss function and a similarity measurement function; and adjusting parameters of the initial image change detection network according to the value of the loss function to generate a preset image change detection network.
The implementation principle and technical effect of the computer device provided by the embodiment of the present application are similar to those of the method embodiment described above, and are not described herein again.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments; inputting an image to be processed into a preset image change detection network for detection, and generating a target characteristic diagram; the preset image change detection network comprises a characteristic graph pyramid network, and the characteristic graph pyramid network is used for fusing characteristic graphs of multiple scales of the image to be processed to generate a target characteristic graph; and determining an image change detection result between the first image and the second image according to the target feature map.
In one embodiment, the preset image change detection network further comprises a feature extraction network;
the computer program when executed by the processor further realizes the steps of:
inputting an image to be processed into a feature extraction network for feature extraction, and generating feature maps with multiple scales; and inputting the feature maps of multiple scales into the feature map pyramid network, and fusing the feature maps of multiple scales to generate a target feature map.
In one embodiment, the feature extraction network comprises an encoding network and a decoding network, wherein the encoding network adopts a residual network;
the computer program when executed by the processor further realizes the steps of:
inputting the image to be processed into the residual network for processing to generate first feature maps of multiple scales; and inputting the first feature maps of multiple scales into a decoding network for processing to generate second feature maps of multiple scales.
In one embodiment, the computer program when executed by the processor further performs the steps of:
inputting the target characteristic graph into a preset classifier for classification, and generating categories corresponding to all pixel points in the image to be processed and the probability of the pixel points under the categories; and generating an image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under the category for each pixel point in the image to be processed.
In one embodiment, the categories include a change class and a non-change class;
the computer program when executed by the processor further realizes the steps of:
determining the pixel points belonging to the change class from the pixel points according to the category corresponding to each pixel point and the probability of the pixel point under the category; denoising the pixel points belonging to the change class to generate target pixel points belonging to the change class; and generating an image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
In one embodiment, the computer program when executed by the processor further performs the steps of:
aligning the first image and the second image to generate an aligned first image and an aligned second image; performing difference processing on the aligned first image and the aligned second image to generate a difference image; and inputting the difference image into a preset image change detection network for detection to generate a target characteristic diagram.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a training set; the training set comprises a first image, a second image, and image change detection labeling results between the first image and the second image which correspond to the same scene at different moments; inputting the first image and the second image into an initial image change detection network for training to generate an image change detection prediction result; calculating the value of the loss function according to the image change detection prediction result and the image change detection annotation result; the loss function comprises a joint loss function, and the joint loss function comprises a cross entropy loss function and a similarity measurement function; and adjusting parameters of the initial image change detection network according to the value of the loss function to generate a preset image change detection network.
The implementation principle and technical effect of the computer-readable storage medium provided by this embodiment are similar to those of the above-described method embodiment, and are not described herein again.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of:
acquiring an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments; inputting an image to be processed into a preset image change detection network for detection, and generating a target characteristic diagram; the preset image change detection network comprises a characteristic graph pyramid network, and the characteristic graph pyramid network is used for fusing characteristic graphs of multiple scales of the image to be processed to generate a target characteristic graph; and determining an image change detection result between the first image and the second image according to the target feature map.
In one embodiment, the preset image change detection network further comprises a feature extraction network;
the computer program when executed by the processor further realizes the steps of:
inputting the image to be processed into the feature extraction network for feature extraction to generate feature maps at multiple scales; and inputting the feature maps at multiple scales into the feature map pyramid network, and fusing the feature maps at multiple scales to generate the target feature map.
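The two steps above (multi-scale extraction, then top-down pyramid fusion) can be sketched in miniature. This is a toy numpy illustration only: average pooling stands in for the feature extraction backbone, and nearest-neighbour upsampling with elementwise addition stands in for the learned lateral connections of a real feature map pyramid network; all function names are hypothetical:

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling (toy stand-in for one extraction stage)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4.0

def upsample2(x):
    """Nearest-neighbour 2x upsampling, as in an FPN top-down pathway."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def extract_pyramid(img, levels=3):
    """Produce feature maps at multiple scales from the input image."""
    feats = [img]
    for _ in range(levels - 1):
        feats.append(avg_pool2(feats[-1]))
    return feats  # ordered fine -> coarse

def fpn_fuse(feats):
    """Top-down fusion: upsample the coarser map and add it to the finer one."""
    top = feats[-1]
    for finer in reversed(feats[:-1]):
        top = finer + upsample2(top)
    return top  # fused "target feature map" at the finest scale
```

Because each coarser level is upsampled and merged into the next finer level, the final map at full resolution carries information from every scale, which is the property the fusion step relies on.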
In one embodiment, the feature extraction network comprises an encoding network and a decoding network, wherein the encoding network adopts a residual network;
the computer program when executed by the processor further realizes the steps of:
inputting the image to be processed into the residual network for processing to generate first feature maps at multiple scales; and inputting the first feature maps at multiple scales into the decoding network for processing to generate second feature maps at multiple scales.
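The defining operation of the residual network used by the encoding network is the skip connection y = ReLU(x + F(x)). A minimal single-channel numpy sketch of one residual block follows; the 3x3 convolution, weight shapes, and function names are illustrative assumptions rather than the disclosed architecture:

```python
import numpy as np

def conv3x3(x, w):
    """Minimal 'same'-padded 3x3 convolution on a single-channel map."""
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def residual_block(x, w1, w2):
    """y = ReLU(x + F(x)): the skip connection defining a residual network."""
    f = np.maximum(conv3x3(x, w1), 0.0)   # first conv + ReLU
    f = conv3x3(f, w2)                    # second conv
    return np.maximum(x + f, 0.0)         # add the identity, then ReLU
```

With zero weights the block reduces to the identity on non-negative inputs, which illustrates why residual encoders remain easy to optimize even when stacked deeply before the decoding network upsamples their outputs.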
In one embodiment, the computer program when executed by the processor further performs the steps of:
inputting the target feature map into a preset classifier for classification to generate, for each pixel point in the image to be processed, a corresponding category and the probability of the pixel point under that category; and for each pixel point in the image to be processed, generating an image change detection result between the first image and the second image according to the category corresponding to the pixel point and the probability of the pixel point under that category.
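A common realization of such a per-pixel classifier is a softmax over class scores at every spatial position. The sketch below assumes the classifier head has already produced a (C, H, W) score tensor; this softmax form and the names are illustrative, not a detail disclosed by the embodiment:

```python
import numpy as np

def classify_pixels(logits):
    """Per-pixel classification of the target feature map.

    logits: (C, H, W) class scores per pixel (C classes, e.g. change /
    non-change). Returns the winning category and its probability.
    """
    z = logits - logits.max(axis=0, keepdims=True)            # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=0, keepdims=True)  # softmax over classes
    categories = probs.argmax(axis=0)                         # (H, W) category map
    confidence = probs.max(axis=0)                            # probability under it
    return categories, confidence
```

The category map and its per-pixel probability are exactly the two quantities the embodiment then uses to assemble the image change detection result.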
In one embodiment, the categories include a change class and a non-change class;
the computer program when executed by the processor further realizes the steps of:
determining pixel points belonging to the change class from the pixel points according to the category corresponding to each pixel point and the probability of the pixel point under that category; denoising the pixel points belonging to the change class to generate target pixel points belonging to the change class; and generating an image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
In one embodiment, the computer program when executed by the processor further performs the steps of:
aligning the first image and the second image to generate an aligned first image and an aligned second image; performing difference processing on the aligned first image and the aligned second image to generate a difference image; and inputting the difference image into the preset image change detection network for detection to generate the target feature map.
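As a toy illustration of the align-then-difference step, the sketch below brute-forces an integer translation that minimises the sum of absolute differences and then returns the absolute difference image. Real systems would use feature-based registration; the exhaustive shift search and all names here are illustrative assumptions:

```python
import numpy as np

def align_and_diff(img1, img2, max_shift=3):
    """Align img2 to img1 by brute-force integer translation, then difference.

    Searches shifts in [-max_shift, max_shift]^2, keeps the one minimising
    the sum of absolute differences (SAD), and returns |img1 - aligned img2|,
    i.e. the difference image fed to the change detection network.
    """
    best, best_shift = None, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(img2, dy, axis=0), dx, axis=1)
            sad = np.abs(img1 - shifted).sum()
            if best is None or sad < best:
                best, best_shift = sad, (dy, dx)
    aligned = np.roll(np.roll(img2, best_shift[0], axis=0), best_shift[1], axis=1)
    return np.abs(img1 - aligned)
```

When the two images differ only by a small translation, the recovered difference image is near zero everywhere, so any remaining non-zero regions correspond to genuine scene changes rather than misregistration.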
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a training set; the training set comprises a first image and a second image that correspond to the same scene at different times, and an image change detection labeling result between the first image and the second image; inputting the first image and the second image into an initial image change detection network for training to generate an image change detection prediction result; calculating the value of a loss function according to the image change detection prediction result and the image change detection labeling result; the loss function comprises a joint loss function, and the joint loss function comprises a cross entropy loss function and a similarity measurement function; and adjusting parameters of the initial image change detection network according to the value of the loss function to generate a preset image change detection network.
The computer program product provided in this embodiment has similar implementation principles and technical effects to those of the method embodiments described above, and is not described herein again.
It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. The volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases; the non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as such combinations are not contradictory, they should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, it should not be construed as limiting the scope of the present application. It should be noted that those skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. An image change detection method, characterized in that the method comprises:
acquiring an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments;
inputting the image to be processed into a preset image change detection network for detection to generate a target feature map; the preset image change detection network comprises a feature map pyramid network, and the feature map pyramid network is used for fusing feature maps of multiple scales of the image to be processed to generate the target feature map;
and determining an image change detection result between the first image and the second image according to the target feature map.
2. The method of claim 1, wherein the preset image change detection network further comprises a feature extraction network; and the inputting the image to be processed into a preset image change detection network for detection to generate a target feature map comprises:
inputting the image to be processed into the feature extraction network for feature extraction to generate feature maps at multiple scales;
and inputting the feature maps at multiple scales into the feature map pyramid network, and fusing the feature maps at multiple scales to generate the target feature map.
3. The method of claim 2, wherein the feature extraction network comprises an encoding network and a decoding network, and the encoding network adopts a residual network; and the inputting the image to be processed into the feature extraction network for feature extraction to generate feature maps at multiple scales comprises:
inputting the image to be processed into the residual network for processing to generate first feature maps at multiple scales;
and inputting the first feature maps at multiple scales into the decoding network for processing to generate second feature maps at multiple scales.
4. The method of claim 1, wherein determining the image change detection result between the first image and the second image according to the target feature map comprises:
inputting the target feature map into a preset classifier for classification to generate a category corresponding to each pixel point in the image to be processed and the probability of the pixel point under the category;
and for each pixel point in the image to be processed, generating an image change detection result between the first image and the second image according to the category corresponding to the pixel point and the probability of the pixel point under the category.
5. The method of claim 4, wherein the categories include a change class and a non-change class; the generating an image change detection result between the first image and the second image according to the category corresponding to each pixel point and the probability of the pixel point under the category includes:
determining pixel points belonging to the change class from the pixel points according to the category corresponding to each pixel point and the probability of the pixel point under the category;
denoising the pixel points belonging to the change class to generate target pixel points belonging to the change class;
and generating an image change detection result between the first image and the second image according to the target pixel points belonging to the change class.
6. The method according to claim 1, wherein the inputting the image to be processed into a preset image change detection network for detection to generate a target feature map further comprises:
aligning the first image and the second image to generate an aligned first image and an aligned second image;
performing difference processing on the aligned first image and the aligned second image to generate a difference image;
and inputting the difference image into the preset image change detection network for detection to generate the target feature map.
7. A method for training an image change detection network, the method comprising:
acquiring a training set; the training set comprises a first image, a second image, and image change detection labeling results between the first image and the second image which correspond to the same scene at different moments;
inputting the first image and the second image into an initial image change detection network for training to generate an image change detection prediction result;
calculating the value of a loss function according to the image change detection prediction result and the image change detection labeling result; the loss function comprises a joint loss function, and the joint loss function comprises a cross entropy loss function and a similarity measurement function;
and adjusting parameters of the initial image change detection network according to the value of the loss function to generate a preset image change detection network.
8. An image change detection apparatus, characterized in that the apparatus comprises:
the first acquisition module is used for acquiring an image to be processed; the image to be processed comprises a first image and a second image which correspond to the same scene at different moments;
the first generation module is used for inputting the image to be processed into a preset image change detection network for detection to generate a target feature map; the preset image change detection network comprises a feature map pyramid network, and the feature map pyramid network is used for fusing feature maps of multiple scales of the image to be processed to generate the target feature map;
and the determining module is used for determining an image change detection result between the first image and the second image according to the target feature map.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202111516637.2A 2021-12-13 2021-12-13 Image change detection method, training method, device and computer equipment Pending CN114419406A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111516637.2A CN114419406A (en) 2021-12-13 2021-12-13 Image change detection method, training method, device and computer equipment

Publications (1)

Publication Number Publication Date
CN114419406A true CN114419406A (en) 2022-04-29

Family

ID=81264991

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111516637.2A Pending CN114419406A (en) 2021-12-13 2021-12-13 Image change detection method, training method, device and computer equipment

Country Status (1)

Country Link
CN (1) CN114419406A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115410059A (en) * 2022-11-01 2022-11-29 山东锋士信息技术有限公司 Remote sensing image part supervision change detection method and device based on contrast loss
CN115761239A (en) * 2023-01-09 2023-03-07 深圳思谋信息科技有限公司 Semantic segmentation method and related device
CN117576517A (en) * 2024-01-15 2024-02-20 西南交通大学 Optical remote sensing image self-supervision contrast learning change detection method and device
CN117576517B (en) * 2024-01-15 2024-04-12 西南交通大学 Optical remote sensing image self-supervision contrast learning change detection method and device

Similar Documents

Publication Publication Date Title
Oh et al. Crowd counting with decomposed uncertainty
CN109840556B (en) Image classification and identification method based on twin network
CN108256562B (en) Salient target detection method and system based on weak supervision time-space cascade neural network
CN114202672A (en) Small target detection method based on attention mechanism
CN109086811B (en) Multi-label image classification method and device and electronic equipment
CN112541904B (en) Unsupervised remote sensing image change detection method, storage medium and computing device
CN114419406A (en) Image change detection method, training method, device and computer equipment
CN106845341B (en) Unlicensed vehicle identification method based on virtual number plate
Chen et al. Convolutional neural network based dem super resolution
CN114549913B (en) Semantic segmentation method and device, computer equipment and storage medium
CN113239869B (en) Two-stage behavior recognition method and system based on key frame sequence and behavior information
CN111179270A (en) Image co-segmentation method and device based on attention mechanism
CN115222998B (en) Image classification method
CN114821155A (en) Multi-label classification method and system based on deformable NTS-NET neural network
CN113111716A (en) Remote sensing image semi-automatic labeling method and device based on deep learning
Zhou et al. Attention transfer network for nature image matting
Wang et al. Global contextual guided residual attention network for salient object detection
Huan et al. MAENet: multiple attention encoder–decoder network for farmland segmentation of remote sensing images
CN111967408A (en) Low-resolution pedestrian re-identification method and system based on prediction-recovery-identification
CN115862119A (en) Human face age estimation method and device based on attention mechanism
WO2019136591A1 (en) Salient object detection method and system for weak supervision-based spatio-temporal cascade neural network
Deng et al. A paddy field segmentation method combining attention mechanism and adaptive feature fusion
Li et al. Refined Division Features Based on Transformer for Semantic Image Segmentation
CN117237858B (en) Loop detection method
Li Pixel-level feature enhancement and weighted fusion for visual relationship detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination