CN117474885A - Remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection - Google Patents

Remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection

Info

Publication number
CN117474885A
CN117474885A
Authority
CN
China
Prior art keywords
remote sensing
unmanned aerial
aerial vehicle
sensing image
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311477869.0A
Other languages
Chinese (zh)
Inventor
杨扬
董亚鑫
赵姗
白海城
邢林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Normal University
Original Assignee
Yunnan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Normal University filed Critical Yunnan Normal University
Priority to CN202311477869.0A
Publication of CN117474885A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/97 Determining parameters from multiple pictures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/269 Analysis of motion using gradient-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/17 Terrestrial scenes taken from planes or by drones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20228 Disparity calculation for image-based rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Remote Sensing (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of computer vision, and in particular to a remote sensing image parallax change processing method, device and storage medium for detecting a garbage scattering area with an unmanned aerial vehicle. The method comprises the following steps: based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image; based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlases corresponding to two matched unmanned aerial vehicle remote sensing images to obtain a pair of feature-aligned unmanned aerial vehicle remote sensing images, i.e., an unmanned aerial vehicle remote sensing image combination; and based on a progressive difference feature fusion and detection module, performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the combination and obtain a multi-level feature difference map subjected to parallax change processing. The method aims to solve the problem of how to perform parallax change processing on unmanned aerial vehicle remote sensing images with viewing angle differences.

Description

Remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection
Technical Field
The invention relates to the technical field of computer vision, in particular to a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle.
Background
Small unmanned aerial vehicles play an important role in Earth remote sensing observation, offering advantages such as cooperative technology, ease of operation, high-resolution image acquisition and flexibility. When a small unmanned aerial vehicle captures images of the same location at different times, the acquired images inevitably exhibit a viewing angle difference due to factors such as inaccurate GPS (Global Positioning System) signals, changes in flight attitude, complex wind speeds and directions, and the geographical environment.
To detect changes between images that exhibit a viewing angle difference, one related technical solution adopts a weakly supervised semantic scene change detection model. The image is divided into grid cells, features of the grid cells are extracted with a convolutional neural network (CNN), Euclidean distances between the grid cell features are calculated, and whether the image within a grid region has changed is judged from these distances.
However, the inventors found in the course of conceiving and implementing the present application that, when an unmanned aerial vehicle is applied to garbage scattering area detection and a conventional parallax change detection model is used for change detection, image changes with a viewing angle difference can be detected accurately only when corresponding pixels fall into grid cells at the same positions. In addition, because the ability to process images with a viewing angle difference depends on the grid cell size, enlarging the grid cells to handle images with a large viewing angle difference introduces errors into the change detection region. The method is therefore difficult to apply to image change processing with large viewing angle differences and suffers from insufficient applicability.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention mainly aims to provide a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle, and aims to solve the problem of how to perform parallax change processing on remote sensing images of the unmanned aerial vehicle with visual angle differences.
In order to achieve the above object, the present invention provides a method for processing parallax change of a remote sensing image for detecting a garbage scattering area of an unmanned aerial vehicle, the method comprising:
based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
Based on a progressive difference feature fusion and detection module, performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtaining a multi-level feature difference map subjected to parallax change processing.
Optionally, the feature pyramid module is built based on a VGG16 network module, and the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area based on the feature pyramid module to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image includes:
performing feature extraction on the unmanned aerial vehicle remote sensing image to obtain a plurality of feature images with different layers, and forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image;
the number of channels of the feature map located at the next layer is twice the number of channels of the feature map located at the adjacent upper layer.
Optionally, the two matched unmanned aerial vehicle remote sensing images include a first unmanned aerial vehicle remote sensing image and a second unmanned aerial vehicle remote sensing image, the multi-level feature atlas includes a first multi-level feature atlas corresponding to the first unmanned aerial vehicle remote sensing image, and a second multi-level feature atlas corresponding to the second unmanned aerial vehicle remote sensing image, the step of obtaining the unmanned aerial vehicle remote sensing image combination after feature alignment includes:
Calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas;
invoking an optical flow estimator;
determining a global optical flow estimation result of a lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining a local optical flow estimation result of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator, wherein the multi-level feature map set comprises the first multi-level feature map set and the second multi-level feature map set;
according to the global optical flow estimation result and the local optical flow estimation result, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image;
and determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
Optionally, the warp layer includes a first type warp layer and a second type warp layer, and the step of obtaining a coarse alignment first multi-level feature map set and a coarse alignment second multi-level feature map set by moving pixels in the first multi-level feature map set and the second multi-level feature map set through the warp layer according to the global optical flow estimation result and the local optical flow estimation result includes:
Warping the feature maps of the second multi-level feature atlas through the first type of warping layer so as to align them with the feature maps of the first multi-level feature atlas, thereby obtaining the coarse-aligned first multi-level feature atlas and the coarse-aligned second multi-level feature atlas; and/or,
using an optical flow estimation result of a previous layer to warp a feature map of a current layer through the second type of warping layer so as to align the feature map of each layer of the second multi-level feature map set with the feature map of a corresponding level in the first multi-level feature map set, thereby obtaining the coarse alignment first multi-level feature map set and the coarse alignment second multi-level feature map set, wherein the optical flow estimation result comprises the global optical flow estimation result and the local optical flow estimation result;
the second type of warping layer is applied to all feature maps in the multi-level feature atlas except the bottommost (lowest-resolution) feature map.
Optionally, the unmanned aerial vehicle remote sensing image combination includes a coarse-aligned first unmanned aerial vehicle remote sensing image and a coarse-aligned second unmanned aerial vehicle remote sensing image, and the step of performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination based on the progressive difference feature fusion and detection module, so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain the multi-level feature difference map subjected to parallax change processing, includes:
Determining a feature difference map between the feature map of the coarse-aligned second unmanned aerial vehicle remote sensing image and the feature map of the coarse-aligned first unmanned aerial vehicle remote sensing image, and calculating the absolute value of the feature difference map;
performing up-sampling processing on the feature difference image generated in the previous level to obtain an up-sampling feature difference image;
and fusing the absolute value of the difference map and the upsampled characteristic difference map through an attention mechanism and multi-layer convolution processing to obtain the multi-layer characteristic difference map.
Optionally, before the step of obtaining the multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image by downsampling at least two unmanned aerial vehicle remote sensing images collected from the garbage scattering area by the unmanned aerial vehicle based on the feature pyramid module, the method further includes:
based on an endpoint error loss function, constraining the optical flow error loss of the feature-aligned unmanned aerial vehicle remote sensing image combination relative to the initially acquired unmanned aerial vehicle remote sensing image combination;
and constraining characteristic errors between deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
Optionally, before the step of obtaining the multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image by downsampling at least two unmanned aerial vehicle remote sensing images collected from the garbage scattering area by the unmanned aerial vehicle based on the feature pyramid module, the method further includes:
Acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random overturn to obtain a preprocessed remote sensing image training data set;
determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
training a preset neural network in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced;
the remote sensing image training data set comprises a change map obtained by synthesizing a garbage distribution image with viewpoint difference and a remote sensing image acquired by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
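By way of illustration only, and not as part of the disclosed method, the random flipping and random affine transformation described above might be sketched roughly as follows; the transformation ranges and the PyTorch/torchvision helpers used here are assumptions:

```python
import random
import torch
import torchvision.transforms.functional as TF

def augment_pair(img_a: torch.Tensor, img_b: torch.Tensor, max_deg: float = 5.0):
    """Randomly flip an image pair, then apply a random affine transform to the
    second image to enhance the viewpoint difference (illustrative values only)."""
    # Random horizontal / vertical flips applied consistently to both images.
    if random.random() < 0.5:
        img_a, img_b = TF.hflip(img_a), TF.hflip(img_b)
    if random.random() < 0.5:
        img_a, img_b = TF.vflip(img_a), TF.vflip(img_b)
    # Random affine transform on one image of the pair to simulate a viewpoint change.
    angle = random.uniform(-max_deg, max_deg)
    translate = [random.randint(-8, 8), random.randint(-8, 8)]
    scale = random.uniform(0.95, 1.05)
    shear = [random.uniform(-2.0, 2.0)]
    img_b = TF.affine(img_b, angle=angle, translate=translate, scale=scale, shear=shear)
    return img_a, img_b
```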
In addition, to achieve the above object, the present invention further provides an end-to-end variation detecting network model, including:
the feature pyramid module is used for downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
The optical flow alignment module is used for carrying out feature alignment mapping on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with the aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for carrying out feature alignment on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images;
the progressive difference feature fusion and detection module is used for performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
In addition, in order to achieve the above object, the present invention also provides an unmanned aerial vehicle parallax change processing system, comprising: a memory, a processor, and a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection that is stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection described above.
In addition, in order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection, which when executed by a processor, implements the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection as described above.
The embodiments of the invention provide a remote sensing image parallax change processing method, device and storage medium for unmanned aerial vehicle garbage scattering area detection. By providing an end-to-end change detection network model that performs optical flow estimation on the feature maps while simultaneously executing change detection, the network can directly process images with viewpoint differences. This effectively improves the performance of the unmanned aerial vehicle system in scenes with accumulated errors or large parallax changes in the remote sensing images, and improves throughput by eliminating repeated feature extraction.
Drawings
Fig. 1 is a schematic architecture diagram of a hardware operating environment of an unmanned aerial vehicle parallax change processing system according to an embodiment of the present invention;
fig. 2 is a flowchart of a first embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
Fig. 3 is a schematic flow chart of a second embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
fig. 4 is a schematic diagram of another flow involved in a second embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
fig. 5 is a schematic flow chart of a third embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
fig. 6 is a schematic diagram of an end-to-end variation detection network model related to a remote sensing image parallax variation processing method for detecting a garbage scattering area of an unmanned aerial vehicle.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
According to the method, an end-to-end change detection network model mounted on the unmanned aerial vehicle parallax change processing system is provided. At least two unmanned aerial vehicle remote sensing images with viewing angle differences are first preprocessed by the feature pyramid module of the end-to-end change detection network model, which improves the network model's ability to handle viewing angle differences in the unmanned aerial vehicle remote sensing images. The feature maps are then aligned by means of optical flow estimation, so that viewing angle differences between unmanned aerial vehicle remote sensing images can be detected accurately even when corresponding pixels are not divided into grid cells at the same positions. When a viewing angle difference is detected, the images with viewpoint differences are processed by the progressive difference feature fusion and detection module. This improves the performance of the unmanned aerial vehicle system in scenes with accumulated errors or large parallax changes in the remote sensing images, improves throughput by eliminating repeated feature extraction, and avoids introducing errors into the change detection region, while also removing the dependence on grid cell size found in traditional parallax change detection models, thereby enabling image change processing with larger viewing angle differences.
In order to better understand the above technical solution, exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As an implementation scheme, fig. 1 is a schematic architecture diagram of a hardware running environment of an unmanned aerial vehicle parallax change processing system according to an embodiment of the present invention.
As shown in fig. 1, the unmanned aerial vehicle parallax change processing system may include: a processor 1001, such as a CPU, memory 1005, user interface 1003, network interface 1004, communication bus 1002. Wherein the communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a Display, an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may further include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the unmanned aerial vehicle parallax change processing system architecture shown in fig. 1 is not limiting of the unmanned aerial vehicle parallax change processing system, and may include more or fewer components than illustrated, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection. The operating system is a program that manages and controls the hardware and software resources of the unmanned aerial vehicle parallax change processing system and supports the running of the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection as well as other software or programs.
In the unmanned aerial vehicle parallax change processing system shown in fig. 1, the user interface 1003 is mainly used for connecting to a terminal and performing data communication with the terminal; the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; and the processor 1001 may be configured to invoke the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005.
In this embodiment, the unmanned aerial vehicle parallax change processing system includes: a memory 1005, a processor 1001, and a remote sensing image parallax change processing program stored on the memory and operable on the processor for unmanned aerial vehicle garbage scattering area detection, wherein:
when the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
based on a progressive difference feature fusion and detection module, performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtaining a multi-level feature difference map subjected to parallax change processing.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
performing feature extraction on the unmanned aerial vehicle remote sensing image to obtain a plurality of feature images with different layers, and forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image;
the number of channels of the feature map located at the next layer is twice the number of channels of the feature map located at the adjacent upper layer.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas;
invoking an optical flow estimator;
determining a global optical flow estimation result of a lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining a local optical flow estimation result of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator, wherein the multi-level feature map set comprises the first multi-level feature map set and the second multi-level feature map set;
According to the global optical flow estimation result and the local optical flow estimation result, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image;
and determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
warping the feature maps of the second multi-level feature atlas through the first type of warping layer so as to align them with the feature maps of the first multi-level feature atlas, thereby obtaining the coarse-aligned first multi-level feature atlas and the coarse-aligned second multi-level feature atlas; and/or,
using an optical flow estimation result of a previous layer to warp a feature map of a current layer through the second type of warping layer so as to align the feature map of each layer of the second multi-level feature map set with the feature map of a corresponding level in the first multi-level feature map set, thereby obtaining the coarse alignment first multi-level feature map set and the coarse alignment second multi-level feature map set, wherein the optical flow estimation result comprises the global optical flow estimation result and the local optical flow estimation result;
The second type of warping layer is applied to all feature maps in the multi-level feature atlas except the bottommost (lowest-resolution) feature map.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
determining a feature difference map between the feature map of the coarse-aligned second unmanned aerial vehicle remote sensing image and the feature map of the coarse-aligned first unmanned aerial vehicle remote sensing image, and calculating the absolute value of the feature difference map;
performing up-sampling processing on the feature difference image generated in the previous level to obtain an up-sampling feature difference image;
and fusing the absolute value of the difference map and the upsampled characteristic difference map through an attention mechanism and multi-layer convolution processing to obtain the multi-layer characteristic difference map.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
based on an endpoint error loss function, constraining the optical flow error loss of the feature-aligned unmanned aerial vehicle remote sensing image combination relative to the initially acquired unmanned aerial vehicle remote sensing image combination;
and constraining characteristic errors between deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random overturn to obtain a preprocessed remote sensing image training data set;
determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
training a preset neural network in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced;
the remote sensing image training data set comprises a change map obtained by synthesizing a garbage distribution image with viewpoint difference and a remote sensing image acquired by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
Based on the hardware architecture of the unmanned aerial vehicle parallax change processing system based on the computer vision technology, the embodiment of the remote sensing image parallax change processing method for detecting the unmanned aerial vehicle garbage scattering area is provided.
First embodiment
In this embodiment, the unmanned aerial vehicle parallax change processing system is equipped with the end-to-end change detection network model provided in this embodiment, and the end-to-end change detection network model mainly includes three major modules: the device comprises a feature pyramid module, an optical flow alignment module and a progressive difference feature fusion and detection module.
The feature pyramid module is mainly used for carrying out downsampling on the remote sensing image of the unmanned aerial vehicle, so that the processing speed of the end-to-end change detection network model is improved while a feature map with richer semantic information is obtained, and the subsequent module can be conveniently identified.
The main purpose of the optical flow alignment module is to align the multi-level feature atlas obtained after the feature pyramid module is processed, so that the performance deterioration of the whole unmanned aerial vehicle parallax change processing system caused by accumulated errors in the unmanned aerial vehicle image acquisition process is avoided, and the reduction of the system throughput caused by extracting excessive redundant features is avoided.
Because the image frames in the unmanned aerial vehicle remote sensing image combination obtained after alignment by the optical flow alignment module are multi-level feature maps, repeated features exist among the different layers. The progressive difference feature fusion and detection module provided in this embodiment is therefore adopted to eliminate the repeated features among the image frames inside the combination, generate the multi-level feature difference map subjected to parallax change processing, and complete the parallax change processing of the unmanned aerial vehicle remote sensing images.
Referring to fig. 2, in this embodiment, the remote sensing image parallax change processing method for detecting the garbage scattering area of the unmanned aerial vehicle includes the following steps:
step S10, based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
in this embodiment, the unmanned aerial vehicle carries out the low altitude flight in the top of rubbish scattering area, sets up the monocular camera on the unmanned aerial vehicle and takes the remote sensing image of below rubbish scattering area, unmanned aerial vehicle remote sensing image promptly, in the unmanned aerial vehicle parallax change processing system of unmanned aerial vehicle built-in the input unmanned aerial vehicle remote sensing image to this embodiment, carries out parallax change processing to unmanned aerial vehicle remote sensing image. Because parallax change occurs based on two image frames continuously acquired in the unmanned aerial vehicle flight process, we need to acquire at least two unmanned aerial vehicle remote sensing images first.
In the step, the feature pyramid module performs downsampling on at least two input unmanned aerial vehicle remote sensing images to obtain a plurality of multi-level feature images corresponding to the unmanned aerial vehicle remote sensing images, and a multi-level feature image set is formed.
As an alternative embodiment, the downsampling is specifically as follows:
First, feature extraction is performed on the unmanned aerial vehicle remote sensing image to obtain several feature maps at different levels, forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image. Among the extracted feature maps, the number of channels of the feature map at the next (lower) level is twice the number of channels of the feature map at the adjacent upper level.
Optionally, the feature pyramid module may be built with a VGG16 network, i.e., a VGG (Visual Geometry Group) network with 16 weight layers, formed by alternately stacking convolution layers and pooling layers and ending with fully-connected layers for classification.
For example, suppose 4 levels of feature maps are extracted and the initial unmanned aerial vehicle remote sensing image contains 512 channels. First, to reduce the number of network parameters, the number of channels is adjusted to 256, i.e., half of the original number; then 4 feature maps at different levels are extracted from the input image frame. These feature maps encapsulate different semantic information; the sizes of the 4 levels are 1/2, 1/4, 1/8 and 1/16 of the input image size, and their channel numbers are 64, 128, 256 and 256 respectively.
The multi-level feature maps extracted by the feature pyramid module contain both deep semantic information and shallow, accurate positional details; fusing the deep and shallow feature maps therefore yields rich semantic information together with accurate position information.
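As a purely illustrative sketch (one possible realization, not the reference implementation of this embodiment), a VGG-style feature pyramid with four levels at 1/2, 1/4, 1/8 and 1/16 resolution and 64, 128, 256 and 256 channels, as in the example above, could look as follows in PyTorch:

```python
import torch
import torch.nn as nn

class FeaturePyramid(nn.Module):
    """VGG-style feature pyramid: four levels at 1/2, 1/4, 1/8 and 1/16 of the
    input resolution with 64, 128, 256 and 256 channels (illustrative sketch)."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        def block(c_in, c_out):
            # Two 3x3 convolutions followed by a stride-2 max pooling (downsampling).
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )
        self.level1 = block(in_channels, 64)   # 1/2 resolution,  64 channels
        self.level2 = block(64, 128)           # 1/4 resolution, 128 channels
        self.level3 = block(128, 256)          # 1/8 resolution, 256 channels
        self.level4 = block(256, 256)          # 1/16 resolution, 256 channels

    def forward(self, x):
        f1 = self.level1(x)
        f2 = self.level2(f1)
        f3 = self.level3(f2)
        f4 = self.level4(f3)
        return [f1, f2, f3, f4]   # multi-level feature atlas

# Example: two UAV remote sensing images produce two multi-level feature atlases.
# pyramid = FeaturePyramid()
# feats_a = pyramid(torch.randn(1, 3, 256, 256))
# feats_b = pyramid(torch.randn(1, 3, 256, 256))
```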
Step S20, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images based on an optical flow alignment module to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
In this embodiment, after the multi-level feature atlas is obtained, feature alignment mapping is performed on the multi-level feature atlas based on the optical flow alignment module. Optical flow is an effective method for establishing dense correspondences between a pair of images, and alignment of the image feature maps is achieved by means of optical flow.
The optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images.
The optical flow estimator consists of a multi-layer convolutional neural network for estimating pixel displacement between successive frames; its objective is to find a displacement vector for each pixel in adjacent image frames that describes the direction and speed of object movement in the image. In this embodiment, the number of channels is gradually reduced through the network, finally generating two-channel optical flow information, where the two channels represent the horizontal and vertical displacement of each pixel. Optionally, the optical flow estimator may consist of 6 convolution layers designed for optical flow calculation, all using 3x3 convolution kernels.
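A minimal sketch of such an estimator, assuming six 3x3 convolution layers that progressively reduce the channel count down to a two-channel flow output (the intermediate channel widths are assumptions), might be:

```python
import torch.nn as nn

def make_flow_estimator(in_channels: int) -> nn.Sequential:
    """Six 3x3 convolution layers that progressively reduce the channel count
    and end with a 2-channel output (horizontal and vertical displacement).
    Intermediate widths are illustrative assumptions."""
    widths = [128, 96, 64, 32, 16]
    layers, c = [], in_channels
    for w in widths:
        layers += [nn.Conv2d(c, w, kernel_size=3, padding=1), nn.LeakyReLU(0.1, inplace=True)]
        c = w
    layers.append(nn.Conv2d(c, 2, kernel_size=3, padding=1))  # 6th conv: flow (dx, dy)
    return nn.Sequential(*layers)
```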
The warping layer is the layer in the end-to-end change detection network model that performs the warping operation. It is responsible for deforming the feature maps of two matched images so that image B, serving as the matched object, is warped toward image A, serving as the matching reference, thereby aligning the two feature maps.
Illustratively, the warping layer is defined as follows:
F_w(x) = F_b(x + of(x))
where F_w denotes the warped feature map, x is a coordinate in the feature map F_b of image B serving as the matched object, and of(x) is the optical flow estimation result at x.
Optionally, in the end-to-end change detection network model proposed in this embodiment, the warping layers may also be divided into two types. The first type uses the optical flow estimated at the same level to warp the feature map of image B, serving as the matched object, into alignment with the feature map of image A, serving as the matching reference; its aim is to strengthen the alignment of the feature maps and thereby improve the accuracy of change detection. The second type uses the optical flow estimated at the previous level to warp the feature map of image B at the current level, roughly aligning it with the feature map of image A. This type of warping layer is applicable to all levels except the lowest one; its aim is to achieve coarse alignment of the feature maps and thereby improve the accuracy of the optical flow estimation at the current level. Because it performs coarse alignment with the optical flow information estimated at the previous stage, accurate optical flow is obtained while the computational effort and computation time are reduced.
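As an illustrative sketch of the warping operation defined above, assuming bilinear sampling (an implementation assumption) and flow expressed in pixels:

```python
import torch
import torch.nn.functional as F

def warp(feat_b: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp feature map F_b with optical flow of(x), i.e. sample F_b at x + of(x).
    feat_b: (N, C, H, W), flow: (N, 2, H, W) in pixels. Bilinear sampling via
    grid_sample is an implementation assumption."""
    n, _, h, w = feat_b.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat_b.device)   # (2, H, W)
    new_pos = grid.unsqueeze(0) + flow                               # x + of(x)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    new_x = 2.0 * new_pos[:, 0] / max(w - 1, 1) - 1.0
    new_y = 2.0 * new_pos[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((new_x, new_y), dim=-1)                # (N, H, W, 2)
    return F.grid_sample(feat_b, sample_grid, align_corners=True)
```

Both types of warping layer can reuse such a function; they differ only in which optical flow estimate (same level or previous level) is supplied.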
As an alternative implementation, in this embodiment, the feature alignment map includes two maps, namely a local correlation map and a global correlation map, and the pixel-level correspondence between feature maps is constructed by using two maps of different ranges. And then calling an optical flow estimator preset in the optical flow alignment module, and determining an optical flow estimation result according to the optical flow estimator and two mappings, wherein the optical flow estimation result also comprises a global optical flow estimation result and the local optical flow estimation result. And performing feature alignment on the multi-level feature atlas through a warp layer preset in the optical flow alignment module, a global optical flow estimation result and a local optical flow estimation result.
Because the feature alignment is an action between two matched unmanned aerial vehicle remote sensing images, for convenience of description, the two matched unmanned aerial vehicle remote sensing images are called a first unmanned aerial vehicle remote sensing image and a second unmanned aerial vehicle remote sensing image, a multi-level feature atlas corresponding to the first unmanned aerial vehicle remote sensing image is called a first multi-level feature atlas, and a multi-level feature atlas corresponding to the second unmanned aerial vehicle remote sensing image is called a second multi-level feature atlas.
The process will be described below:
and firstly, calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas.
In this process, the difference between the local correlation map and the global correlation map lies in the calculation range: the local correlation map computes correspondences only within a specified distance d around coordinate x, whereas the global correlation map computes correspondences between the pixel at x and all pixel positions in the other feature map.
Illustratively, the correlation map is defined as follows:
c(x_1, x_2) = F_a(x_1)^T F_b(x_2)
where x_1 is a coordinate in the first multi-level feature atlas F_a and x_2 is a coordinate in the second multi-level feature atlas F_b.
The local correlation map is restricted to |x_1 - x_2| ≤ d, where d is the specified distance; optionally, d may be set to 1. No distance restriction is applied to the global correlation map.
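A rough sketch of the two correlation maps under these definitions (the tensor shapes and the loop-based local correlation are implementation assumptions) might be:

```python
import torch
import torch.nn.functional as F

def global_correlation(fa: torch.Tensor, fb: torch.Tensor) -> torch.Tensor:
    """c(x1, x2) = F_a(x1)^T F_b(x2) for every pair of positions.
    fa, fb: (N, C, H, W) -> (N, H*W, H, W) correlation volume."""
    n, c, h, w = fa.shape
    fa_flat = fa.view(n, c, h * w)                          # (N, C, HW)
    fb_flat = fb.view(n, c, h * w)                          # (N, C, HW)
    corr = torch.einsum("nci,ncj->nij", fa_flat, fb_flat)   # (N, HW, HW)
    return corr.view(n, h * w, h, w)

def local_correlation(fa: torch.Tensor, fb: torch.Tensor, d: int = 1) -> torch.Tensor:
    """Correlation restricted to |x1 - x2| <= d (d = 1 gives a 3x3 neighbourhood)."""
    n, c, h, w = fa.shape
    fb_pad = F.pad(fb, (d, d, d, d))
    out = []
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = fb_pad[:, :, dy:dy + h, dx:dx + w]
            out.append((fa * shifted).sum(dim=1, keepdim=True))  # dot product per pixel
    return torch.cat(out, dim=1)                            # (N, (2d+1)^2, H, W)
```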
And secondly, invoking an optical flow estimator, and then determining a global optical flow estimation result of the lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining local optical flow estimation results of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator. Note that the multi-level feature atlas here includes the first multi-level feature atlas and the second multi-level feature atlas.
It should be noted that in this process, since estimating optical flow using global correlation mapping on a high resolution feature map is a computationally intensive and time consuming process, only global correlation is used here to perform optical flow estimation on the lowest resolution feature map. For alignment of feature maps of other levels, then the local correlation mapping is used to estimate optical flow.
It should further be noted that determining the optical flow estimation result does not differ from conventional optical flow estimation with an optical flow estimator; the difference is that in this embodiment the estimation is divided into different ranges (i.e. global or local), and different optical flow estimation strategies are adopted for feature maps of different resolutions in the multi-level feature atlas, which preserves the accuracy of the estimation result as far as possible while reducing the computational cost of the optical flow alignment module in the system.
And thirdly, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer according to the global optical flow estimation result and the local optical flow estimation result to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image.
In the process, estimating pixel displacement between the whole image frames through a global optical flow estimation result, so as to determine information of the flying movement direction and speed of an object in the unmanned aerial vehicle remote sensing image; and the local optical flow estimation is used to estimate more detailed displacement information of local areas in the image, thereby capturing subtle changes in the image and relative motion between objects. And through the warping layer, the pixel points in the two image frames are moved based on the two optical flow estimation results so as to realize rough alignment.
Fourthly, determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
And constructing and forming an unmanned aerial vehicle remote sensing image combination by using the two matched coarse alignment unmanned aerial vehicle remote sensing images, namely the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image, and continuously carrying out parallax change processing by taking the unmanned aerial vehicle remote sensing image combination as a unit in the subsequent steps.
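Tying these pieces together, a coarse-to-fine alignment loop over the pyramid levels might be organized roughly as follows; it assumes the helper functions sketched earlier (global_correlation, local_correlation, warp and per-level flow estimators), and the residual-flow refinement is an implementation assumption:

```python
import torch.nn.functional as F

def coarse_to_fine_align(feats_a, feats_b, estimators):
    """Coarse-to-fine optical flow alignment over a multi-level feature atlas.
    feats_a / feats_b: lists ordered from highest to lowest resolution.
    estimators: one flow estimator per level. All names are illustrative."""
    flows = [None] * len(feats_a)
    for lvl in range(len(feats_a) - 1, -1, -1):           # start at the lowest resolution
        fa, fb = feats_a[lvl], feats_b[lvl]
        if lvl == len(feats_a) - 1:
            corr = global_correlation(fa, fb)              # global matching, coarsest level only
            flows[lvl] = estimators[lvl](corr)
        else:
            up_flow = 2.0 * F.interpolate(flows[lvl + 1], size=fa.shape[-2:],
                                          mode="bilinear", align_corners=True)
            fb_coarse = warp(fb, up_flow)                  # second-type warping layer
            corr = local_correlation(fa, fb_coarse, d=1)   # local matching elsewhere
            flows[lvl] = up_flow + estimators[lvl](corr)   # refine the upsampled flow
    warped_b = [warp(fb, fl) for fb, fl in zip(feats_b, flows)]   # final per-level alignment
    return feats_a, warped_b, flows
```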
Step S30, based on a progressive difference feature fusion and detection module, performing feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtaining a multi-level feature difference map subjected to parallax change processing.
In this embodiment, after the unmanned aerial vehicle remote sensing image combination is obtained, feature difference mapping is performed on the combination based on the progressive difference feature fusion and detection module, so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
As an optional embodiment, the feature difference mapping is performed as follows. First, a feature difference map is determined between the feature map of the coarse-aligned second unmanned aerial vehicle remote sensing image and the feature map of the coarse-aligned first unmanned aerial vehicle remote sensing image, and the absolute value of the feature difference map is calculated. The feature difference map can be obtained by subtracting the feature maps of corresponding levels of the coarse-aligned second unmanned aerial vehicle remote sensing image and the coarse-aligned first unmanned aerial vehicle remote sensing image; taking the absolute value of the feature difference map yields the difference map absolute value.
Then, the feature difference map generated at the previous level is upsampled to obtain an upsampled feature difference map. The purpose of this step is to increase the resolution of the feature difference map. Optionally, the upsampling may be performed with nearest neighbor interpolation or bilinear interpolation.
Finally, the difference map absolute value and the upsampled feature difference map are fused through an attention mechanism and multi-layer convolution processing to obtain the multi-level feature difference map.
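One possible sketch of a single fusion step at a given level, combining the absolute difference, the upsampled previous-level difference map and an optional attention module (the module layout is an assumption), is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferenceFusion(nn.Module):
    """One step of progressive difference feature fusion: |F_a - F_b| at the
    current level is fused with the upsampled difference map from the previous
    (coarser) level via attention and convolutions (layout is illustrative)."""

    def __init__(self, channels: int, prev_channels: int, attention: nn.Module = None):
        super().__init__()
        # An attention module (e.g. channel + spatial attention, sketched after the
        # formulas below) can be plugged in here; identity is used as a placeholder.
        self.attention = attention if attention is not None else nn.Identity()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels + prev_channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, feat_a, feat_b_warped, prev_diff):
        diff = torch.abs(feat_a - feat_b_warped)            # absolute value of the difference map
        prev_up = F.interpolate(prev_diff, size=diff.shape[-2:],
                                mode="bilinear", align_corners=True)
        fused = torch.cat([diff, prev_up], dim=1)
        return self.fuse(self.attention(fused))
```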
Note that the role of the attention mechanism is to direct the focus of the model to locally important information. The attention mechanism includes a channel attention mechanism and a spatial attention mechanism.
The channel attention mechanism improves the network model by learning a channel attention pattern that enhances or suppresses different channels, so that the end-to-end change detection network model can combine the information carried by different channels.
It is noted that feature difference maps generated at different levels contain different information, and shallow layers produce feature difference maps with fine positional details and object-specific features, making them particularly effective in detecting small objects. In contrast, feature difference maps from deeper layers have rich semantic information making them more suitable for detecting larger targets. To take advantage of these different levels, the present embodiment fuses their feature difference maps using a channel attention mechanism, which helps reduce the difference in semantic information and location information across different channels.
Illustratively, the channel attention is defined as follows:
M_c = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
where σ denotes the sigmoid function, F denotes the fused multi-level feature difference map, and MLP denotes a multi-layer perceptron used to learn the channel weights, which determine the importance of each channel in the channel attention mechanism.
The spatial attention mechanism, in contrast, acts on the individual pixels of a feature map: by learning a spatial attention pattern, particular pixels in the feature map can be enhanced or suppressed, so that the end-to-end change detection network model can focus on the key pixels of interest in the feature map. These changed key pixels typically constitute only a small portion of the overall spatial attention pattern. Therefore, this embodiment adopts a spatial attention mechanism to learn the spatial attention pattern, allowing the model to allocate more attention to the pixels that have undergone change, which increases the sensitivity of the model to the pixels that matter most in the change detection task.
Illustratively, the spatial attention is defined as follows:
$M_s = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\big)$
where $f^{7\times 7}$ denotes a convolution operator with a filter size of 7×7, and $F$ denotes the fused multi-level feature difference map.
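Illustratively, under the same assumptions, the spatial attention may be sketched as follows; the class name `SpatialAttention` is an illustrative assumption. A refined difference map would then be obtained by multiplying the fused map by the channel attention weights and the spatial attention weights in turn:

```python
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Spatial attention M_s = sigmoid(f7x7([AvgPool(F); MaxPool(F)])) (sketch)."""

    def __init__(self, kernel_size=7):
        super().__init__()
        # 7x7 convolution applied to the two-channel pooled descriptor map.
        self.conv = nn.Conv2d(2, 1, kernel_size=kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)        # channel-wise average pooling
        mx, _ = x.max(dim=1, keepdim=True)       # channel-wise max pooling
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return attn                              # one weight per pixel
```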
In the technical scheme provided by this embodiment, an end-to-end change detection network model mounted on an unmanned aerial vehicle parallax change processing system is provided. In this model, at least two unmanned aerial vehicle remote sensing images with viewing angle differences are first preprocessed by a feature pyramid module, which improves the ability of the network model to handle viewing angle differences between unmanned aerial vehicle remote sensing images. The feature maps are then aligned by means of optical flow estimation, so that a viewing angle difference between the unmanned aerial vehicle remote sensing images can be detected accurately even when corresponding pixels are not located at the same positions. When a viewing angle difference is detected, the images with viewpoint differences are processed by the progressive difference feature fusion and detection module. This improves the performance of the unmanned aerial vehicle system in scenes with accumulated errors or large parallax changes in the remote sensing images, increases throughput by eliminating repeated feature extraction, avoids introducing errors into the change detection area, and removes the dependence on grid-cell size found in traditional parallax change detection models, thereby realizing change processing for images with larger viewing angle differences.
Second embodiment
Referring to fig. 3, based on the first embodiment, before the step S10, the method further includes:
step S40, based on an endpoint error loss function, constraining the optical flow error loss between the feature-aligned unmanned aerial vehicle remote sensing image combination and the initially acquired unmanned aerial vehicle remote sensing image combination;
as an alternative embodiment, in order to improve the capability of the end-to-end change detection network model to handle images with larger viewpoint differences, an endpoint error (EPE) loss function is used as a supervision mechanism to constrain the optical flow error loss between the feature-aligned unmanned aerial vehicle remote sensing image combination and the initially acquired unmanned aerial vehicle remote sensing images, so as to ensure that the end-to-end change detection network model can accurately predict the optical flow.
The end point error loss function is used in this embodiment to calculate the euclidean distance between the predicted optical flow and the ground-truth optical flow. Illustratively, the endpoint error loss function is defined as follows:
$L_{epe} = \sum_{l} \lambda^{epe}_{l}\,\big\lVert \hat{F}^{l} - F^{l}_{gt} \big\rVert_{2}$

where $\hat{F}^{l}$ denotes the level-$l$ optical flow predicted by the end-to-end change detection network model, $F^{l}_{gt}$ denotes the ground-truth optical flow used as the reference value, and $\lambda^{epe}_{l}$ is the endpoint error loss weight parameter.
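Illustratively, assuming a PyTorch-based implementation in which the per-level weights are hyperparameters, the endpoint error supervision may be sketched as follows; the function name and argument layout are illustrative assumptions:

```python
import torch


def endpoint_error_loss(pred_flows, gt_flows, weights):
    """Weighted sum of per-level endpoint errors (illustrative sketch).

    pred_flows, gt_flows: lists of flow tensors of shape (B, 2, H_l, W_l), one per level.
    weights: list of per-level loss weights (assumed hyperparameters).
    """
    loss = 0.0
    for pred, gt, w in zip(pred_flows, gt_flows, weights):
        # Euclidean distance between predicted and ground-truth flow vectors per pixel.
        epe = torch.norm(pred - gt, p=2, dim=1)   # (B, H_l, W_l)
        loss = loss + w * epe.mean()
    return loss
```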
Referring to fig. 4, based on any of the foregoing embodiments, before step S10, the method further includes:
Step S50, constraining the feature errors between the deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
Further, in this embodiment, the final change map is obtained by combining the multi-level feature difference maps generated by the change detection networks at different levels. The quality of the feature difference map at each level therefore directly affects the overall quality of the final change map, and the effectiveness of the feature difference map at one level is also affected by that of the previous level. Consequently, an error occurring at a certain level of the deep feature difference maps can be amplified in the feature difference maps of subsequent levels, and the gradient vanishing problem may appear during back propagation, which poses challenges for training. To avoid the performance degradation caused by accumulated errors during long-term operation of the end-to-end change detection network model, this embodiment applies a deep supervision strategy based on a binary cross entropy (BCE) loss function in the network: the feature difference map generated at each level of the change detection network is supervised, and the results are combined to obtain the final change map.
Illustratively, the binary cross entropy loss function is defined as follows:
$L_{bce} = -\sum_{l} \lambda^{bce}_{l} \sum_{x} \big[\, GT_{l}(x)\log p_{l}(x) + (1 - GT_{l}(x))\log(1 - p_{l}(x)) \,\big]$

where $GT_{l}(x)$ denotes the label of pixel $x$ at level $l$, $p_{l}(x)$ denotes the network's prediction for pixel $x$ at level $l$, and $\lambda^{bce}_{l}$ is the binary cross entropy loss weight parameter.
Finally, in this embodiment, the total loss for the entire training process is defined as the sum of the end point error loss and the binary cross entropy loss:
$L_{total} = L_{epe} + L_{bce}$
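Illustratively, under the same assumptions, the deeply supervised binary cross entropy term and the total loss may be sketched as follows, reusing the `endpoint_error_loss` sketch given above; the function names, per-level weights and the use of logits are illustrative assumptions:

```python
import torch.nn.functional as F


def deep_supervision_bce_loss(pred_change_maps, gt_change_maps, weights):
    """Weighted BCE applied to the change map predicted at every level (sketch)."""
    loss = 0.0
    for pred, gt, w in zip(pred_change_maps, gt_change_maps, weights):
        # pred is assumed to be raw logits; gt is the binary label map of that level.
        loss = loss + w * F.binary_cross_entropy_with_logits(pred, gt)
    return loss


def total_loss(pred_flows, gt_flows, flow_weights,
               pred_change_maps, gt_change_maps, bce_weights):
    """L_total = L_epe + L_bce, as defined above (illustrative sketch)."""
    # endpoint_error_loss is the sketch defined earlier in this document.
    return (endpoint_error_loss(pred_flows, gt_flows, flow_weights)
            + deep_supervision_bce_loss(pred_change_maps, gt_change_maps, bce_weights))
```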
In the technical scheme provided by this embodiment, in order to improve the capability of the end-to-end change detection network model to handle images with larger viewpoint differences and to avoid the performance degradation caused by accumulated errors during its long-term operation, an endpoint error (EPE) loss function is used as a supervision mechanism to constrain the optical flow error loss between the feature-aligned unmanned aerial vehicle remote sensing image combination and the initially acquired unmanned aerial vehicle remote sensing images, and a binary cross entropy loss function is used as a deep supervision strategy to constrain the feature errors between the deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing images.
Third embodiment
Referring to fig. 5, before step S10, according to any embodiment, the method further includes:
step S60, acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random flipping to obtain a preprocessed remote sensing image training data set;
Step S70, determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
step S80, training an end-to-end change detection network model in the unmanned aerial vehicle based on the preprocessed remote sensing image training dataset after the viewpoint difference is enhanced;
as an alternative embodiment, in order to improve the prediction accuracy of the end-to-end change detection network model, the change detection network model needs to be trained with a large number of training samples; however, no public training data set exists for change detection under viewing angle differences. In this regard, this embodiment proposes a training data set capable of training both the viewpoint difference detection capability and the optical flow supervision capability of the model.
Most publicly available change detection data sets contain an image A as the matching target, an image B as the matched target, and a change map obtained by fusing image A and image B. However, when image A or image B undergoes an affine transformation such as rotation or translation, the positions of objects in the image change, so the accuracy of the fused change map may be affected, which in turn can degrade model performance. This limitation makes most publicly available change detection data sets unsuitable for creating change detection data sets with viewpoint differences.
In this embodiment, the remote sensing image training dataset includes a change map obtained by synthesizing a garbage distribution image with a viewpoint difference and a remote sensing image collected by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
Specifically, a large number of remote sensing images collected by unmanned aerial vehicles are gathered to form a remote sensing image training data set, and the data set is enhanced through random flipping to obtain the preprocessed remote sensing image training data set. The purpose of random flipping is to enhance the diversity and robustness of the data.
By way of example, a garbage scattering area in the preprocessed remote sensing image training data set is then determined. The garbage scattering area refers to the distribution area of the garbage of interest in an image, and this area can be determined with a common convolutional neural network algorithm.
In some embodiments, the application scenario of the unmanned aerial vehicle may be garbage scattering area detection: the unmanned aerial vehicle collects remote sensing images during low-altitude flight and identifies the garbage areas in those images, which constitute the garbage scattering area. When applied to the scenario of an unmanned aerial vehicle detecting a garbage scattering area, the garbage distribution image is generated by randomly combining object images of various materials, including but not limited to plastics, metals, and glass.
It should be noted that, in a real scene, a pair of images taken by the unmanned aerial vehicle at the same location but at different times naturally exhibits a viewpoint difference. To replicate such a viewpoint difference in the synthetic data set, a random affine transformation is applied to the image portion corresponding to the garbage scattering area so as to enhance the viewpoint difference between images in the data set.
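Illustratively, assuming a torchvision-based data pipeline, such random flipping and random affine augmentation may be sketched as follows; for simplicity the sketch applies the affine transform to the whole of image B rather than only to the garbage scattering area, and the parameter ranges are illustrative assumptions rather than limitations of the present application:

```python
import random
import torchvision.transforms.functional as TF


def augment_pair(img_a, img_b, change_map, max_deg=10.0, max_translate=0.05):
    """Random flip of the pair plus a random affine on image B to simulate a
    viewpoint difference (illustrative sketch; parameter ranges are assumed)."""
    # Random horizontal flip applied consistently to both images and the label.
    if random.random() < 0.5:
        img_a, img_b, change_map = TF.hflip(img_a), TF.hflip(img_b), TF.hflip(change_map)

    # Random affine transform (rotation + translation) applied to image B only,
    # so the pair exhibits a viewpoint difference while image A stays fixed.
    # The corresponding ground-truth optical flow and change map would have to
    # be regenerated from the same transform (see the sketch below).
    angle = random.uniform(-max_deg, max_deg)
    h, w = img_b.shape[-2:]
    tx = random.uniform(-max_translate, max_translate) * w
    ty = random.uniform(-max_translate, max_translate) * h
    img_b = TF.affine(img_b, angle=angle, translate=[int(tx), int(ty)],
                      scale=1.0, shear=[0.0])
    return img_a, img_b, change_map
```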
In addition, to improve the optical flow alignment capability of the optical flow alignment module in the end-to-end change detection network model, the optical flow data required to align these image pairs is also provided in the data set. That is, the remote sensing image training data set also needs to include the change map obtained by synthesizing the garbage distribution image with a viewpoint difference and the remote sensing image collected by the unmanned aerial vehicle, as well as the optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
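Since the viewpoint difference is produced by a known affine transform, the ground-truth optical flow required above can be derived directly from that transform. Illustratively, assuming a NumPy-based pipeline, a dense flow field may be obtained from a 2×3 affine matrix as follows; the function name and conventions are illustrative assumptions:

```python
import numpy as np


def affine_to_flow(affine_2x3, height, width):
    """Dense ground-truth optical flow induced by a known 2x3 affine matrix (sketch).

    The flow at pixel (x, y) is the displacement (x', y') - (x, y), where
    (x', y') is the affine-transformed position of (x, y).
    """
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing='ij')
    ones = np.ones_like(xs)
    coords = np.stack([xs, ys, ones], axis=0).reshape(3, -1)      # homogeneous coordinates
    warped = affine_2x3 @ coords                                  # transformed positions (2, H*W)
    flow_u = warped[0].reshape(height, width) - xs                # horizontal displacement
    flow_v = warped[1].reshape(height, width) - ys                # vertical displacement
    return np.stack([flow_u, flow_v], axis=0).astype(np.float32)  # (2, H, W)
```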
And finally, training the end-to-end change detection network model in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced.
In the technical scheme provided by the embodiment, a training data set capable of training the viewpoint difference detection capability and the optical flow supervision capability of the model is provided, and a construction flow of the training data set is provided, so that the prediction accuracy of the end-to-end change detection network model is improved.
In addition, referring to fig. 6, the present embodiment further proposes an end-to-end change detection network model, which includes:
the feature pyramid module 100 is configured to downsample at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area, so as to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
the optical flow alignment module 200 is configured to perform feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images, and obtain a pair of feature aligned unmanned aerial vehicle remote sensing image combinations, where the optical flow alignment module includes an optical flow estimator for optical flow estimation, and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
the progressive difference feature fusion and detection module 300 is configured to perform feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtain a multi-level feature difference map subjected to parallax change processing.
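Illustratively, assuming a PyTorch-based implementation, the three modules listed above may be composed into the end-to-end change detection network model as follows; the class names, constructor arguments and returned values are illustrative assumptions rather than limitations of the present application:

```python
import torch.nn as nn


class EndToEndChangeDetectionNet(nn.Module):
    """Skeleton composing the three modules described above (illustrative sketch)."""

    def __init__(self, feature_pyramid, optical_flow_alignment, diff_fusion_detector):
        super().__init__()
        self.feature_pyramid = feature_pyramid                  # module 100
        self.optical_flow_alignment = optical_flow_alignment    # module 200
        self.diff_fusion_detector = diff_fusion_detector        # module 300

    def forward(self, image_a, image_b):
        # Multi-level feature atlases for the two UAV remote sensing images.
        feats_a = self.feature_pyramid(image_a)
        feats_b = self.feature_pyramid(image_b)
        # Optical-flow-based coarse alignment of the two feature atlases.
        aligned_a, aligned_b, flows = self.optical_flow_alignment(feats_a, feats_b)
        # Progressive difference feature fusion and change detection.
        change_maps = self.diff_fusion_detector(aligned_a, aligned_b)
        return change_maps, flows
```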
Furthermore, it will be appreciated by those of ordinary skill in the art that implementing all or part of the processes in the methods of the above embodiments may be accomplished by computer programs to instruct related hardware. The computer program comprises program instructions, and the computer program may be stored in a storage medium, which is a computer readable storage medium. The program instructions are executed by at least one processor in the unmanned aerial vehicle parallax change processing system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a computer-readable storage medium storing a remote sensing image parallax change processing program for unmanned aerial vehicle garbage distribution area detection, which when executed by a processor, implements the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage distribution area detection as described in the above embodiments.
The computer readable storage medium may be a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
It should be noted that, because the storage medium provided in the embodiments of the present application is a storage medium used to implement the method in the embodiments of the present application, based on the method described in the embodiments of the present application, a person skilled in the art can understand the specific structure and the modification of the storage medium, and therefore, the description thereof is omitted herein. All storage media used in the methods of the embodiments of the present application are within the scope of protection intended in the present application.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. The remote sensing image parallax change processing method for detecting the unmanned aerial vehicle garbage scattering area is characterized by being applied to an unmanned aerial vehicle parallax change processing system, wherein the unmanned aerial vehicle parallax change processing system is provided with an end-to-end change detection network model, and the remote sensing image parallax change processing method for detecting the unmanned aerial vehicle garbage scattering area comprises the following steps:
based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
Based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
based on a progressive difference feature fusion and detection module, performing feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
2. The method of claim 1, wherein the feature pyramid module is built based on a VGG16 network module, and the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area based on the feature pyramid module to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image comprises:
performing feature extraction on the unmanned aerial vehicle remote sensing image to obtain a plurality of feature images with different layers, and forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image;
The number of channels of the feature map located at the next layer is twice the number of channels of the feature map located at the adjacent upper layer.
3. The method of claim 1, wherein the two matched drone remote sensing images include a first drone remote sensing image and a second drone remote sensing image, the multi-level feature atlas includes a first multi-level feature atlas corresponding to the first drone remote sensing image and a second multi-level feature atlas corresponding to the second drone remote sensing image, the step of obtaining a combination of a pair of feature-aligned drone remote sensing images based on the optical flow alignment module by performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched drone remote sensing images comprises:
calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas;
invoking an optical flow estimator;
determining a global optical flow estimation result of a lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining a local optical flow estimation result of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator, wherein the multi-level feature map set comprises the first multi-level feature map set and the second multi-level feature map set;
According to the global optical flow estimation result and the local optical flow estimation result, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image;
and determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
4. The method of claim 3, wherein the warp layers comprise a first type of warp layer and a second type of warp layer, the steps of moving pixels in the first multi-level feature atlas and the second multi-level feature atlas through the warp layers to obtain a coarse alignment first multi-level feature atlas, and coarse alignment second multi-level feature atlas based on the global optical flow estimation result and the local optical flow estimation result comprise:
twisting the feature maps of the second multi-level feature atlas through the first type of warping layer so as to align the feature maps of the second multi-level feature atlas with the first multi-level feature atlas, thereby obtaining the coarse alignment first multi-level feature atlas and the coarse alignment second multi-level feature atlas; and/or,
Using an optical flow estimation result of a previous layer to warp a feature map of a current layer through the second type of warping layer so as to align the feature map of each layer of the second multi-level feature map set with the feature map of a corresponding level in the first multi-level feature map set, thereby obtaining the coarse alignment first multi-level feature map set and the coarse alignment second multi-level feature map set, wherein the optical flow estimation result comprises the global optical flow estimation result and the local optical flow estimation result;
wherein the second type of warping layer is applied to the feature maps of the multi-level feature map set at all layers other than the bottommost layer of feature map.
5. The method of claim 1, wherein the unmanned aerial vehicle remote sensing image combination comprises a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image, and wherein the step of performing, based on a progressive difference feature fusion and detection module, feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing comprises:
determining a characteristic map in the coarse alignment second unmanned aerial vehicle remote sensing image, a characteristic difference map between the characteristic map in the coarse alignment first unmanned aerial vehicle remote sensing image, and calculating a difference map absolute value of the characteristic difference map;
Performing up-sampling processing on the feature difference image generated in the previous level to obtain an up-sampling feature difference image;
and fusing the absolute value of the difference map and the upsampled characteristic difference map through an attention mechanism and multi-layer convolution processing to obtain the multi-layer characteristic difference map.
6. The method of claim 1, wherein the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area based on the feature pyramid module to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image further comprises:
based on the endpoint error loss function, constraining optical flow error loss between the unmanned aerial vehicle remote sensing image combination after characteristic alignment compared with the unmanned aerial vehicle remote sensing image combination acquired initially;
and constraining characteristic errors between deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
7. The method of claim 1, wherein the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area based on the feature pyramid module to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image further comprises:
Acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random flipping to obtain a preprocessed remote sensing image training data set;
determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
training a preset neural network in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced;
the remote sensing image training data set comprises a change map obtained by synthesizing a garbage distribution image with viewpoint difference and a remote sensing image acquired by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
8. An end-to-end variation detection network model, the end-to-end variation detection network model comprising:
the feature pyramid module is used for downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
The optical flow alignment module is used for carrying out feature alignment mapping on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with the aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for carrying out feature alignment on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images;
the progressive difference feature fusion and detection module is used for performing feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
9. An unmanned aerial vehicle parallax change processing system, characterized in that the unmanned aerial vehicle parallax change processing system comprises: a memory, a processor, and a remote sensing image parallax change processing program stored on the memory and operable on the processor, the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection implementing the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection as claimed in any one of claims 1 to 7 when executed by the processor.
10. A computer-readable storage medium, wherein a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection is stored on the computer-readable storage medium, and the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection realizes the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection according to any one of claims 1 to 7 when executed by a processor.