CN117474885A - Remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection - Google Patents

Remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection

Info

Publication number
CN117474885A
CN117474885A
Authority
CN
China
Prior art keywords
remote sensing
unmanned aerial
aerial vehicle
sensing image
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311477869.0A
Other languages
Chinese (zh)
Inventor
杨扬
董亚鑫
赵姗
白海城
邢林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Normal University
Original Assignee
Yunnan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Normal University filed Critical Yunnan Normal University
Priority to CN202311477869.0A
Publication of CN117474885A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/97 Determining parameters from multiple pictures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/269 Analysis of motion using gradient-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/17 Terrestrial scenes taken from planes or by drones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20228 Disparity calculation for image-based rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Remote Sensing (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of computer vision, and in particular to a remote sensing image parallax change processing method, device and storage medium for detecting a garbage scattering area with an unmanned aerial vehicle. The method comprises the following steps: based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image; based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlases corresponding to two matched unmanned aerial vehicle remote sensing images to obtain a pair of feature-aligned unmanned aerial vehicle remote sensing images, i.e., an unmanned aerial vehicle remote sensing image combination; and based on a progressive difference feature fusion and detection module, performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the combination and obtain a multi-level feature difference map subjected to parallax change processing. The method aims to solve the problem of how to perform parallax change processing on unmanned aerial vehicle remote sensing images with viewing angle differences.

Description

Remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection
Technical Field
The invention relates to the technical field of computer vision, in particular to a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle.
Background
Small unmanned aerial vehicles play an important role in Earth remote sensing observation, offering advantages such as cooperative technology, ease of operation, high-resolution image acquisition and flexibility. When a small unmanned aerial vehicle captures images of the same location at different times, the acquired images inevitably exhibit a viewing angle difference due to factors such as inaccurate GPS (Global Positioning System) signals, changes in flight attitude, complex wind speeds and directions, and the geographical environment.
To detect changes between images that exhibit a viewing angle difference, one related technical solution adopts a weakly supervised semantic scene change detection model. The image is divided into grid cells, features of the grid cells are extracted with a convolutional neural network (CNN), Euclidean distances between the grid cell features are calculated, and whether the image within a grid region has changed is judged from these distances.
However, the inventors found in the course of conceiving and implementing the present application that, when an unmanned aerial vehicle is applied to garbage scattering area detection and a conventional parallax change detection model is used for change detection, image changes with a viewing angle difference can be detected accurately only when corresponding pixels fall into grid cells at the same positions. In addition, because the ability to process images with a viewing angle difference depends on the grid cell size, enlarging the grid cells to handle images with a large viewing angle difference introduces errors into the change detection region. The method is therefore difficult to apply to image change processing with large viewing angle differences and suffers from insufficient applicability.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention mainly aims to provide a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle, and aims to solve the problem of how to perform parallax change processing on remote sensing images of the unmanned aerial vehicle with visual angle differences.
In order to achieve the above object, the present invention provides a method for processing parallax change of a remote sensing image for detecting a garbage scattering area of an unmanned aerial vehicle, the method comprising:
based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
Based on a progressive difference feature fusion and detection module, performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtaining a multi-level feature difference map subjected to parallax change processing.
Optionally, the feature pyramid module is built based on a VGG16 network module, and the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area based on the feature pyramid module to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image includes:
performing feature extraction on the unmanned aerial vehicle remote sensing image to obtain a plurality of feature images with different layers, and forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image;
the number of channels of the feature map located at the next layer is twice the number of channels of the feature map located at the adjacent upper layer.
Optionally, the two matched unmanned aerial vehicle remote sensing images include a first unmanned aerial vehicle remote sensing image and a second unmanned aerial vehicle remote sensing image, the multi-level feature atlas includes a first multi-level feature atlas corresponding to the first unmanned aerial vehicle remote sensing image, and a second multi-level feature atlas corresponding to the second unmanned aerial vehicle remote sensing image, the step of obtaining the unmanned aerial vehicle remote sensing image combination after feature alignment includes:
Calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas;
invoking an optical flow estimator;
determining a global optical flow estimation result of a lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining a local optical flow estimation result of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator, wherein the multi-level feature map set comprises the first multi-level feature map set and the second multi-level feature map set;
according to the global optical flow estimation result and the local optical flow estimation result, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image;
and determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
Optionally, the warp layer includes a first type warp layer and a second type warp layer, and the step of obtaining a coarse alignment first multi-level feature map set and a coarse alignment second multi-level feature map set by moving pixels in the first multi-level feature map set and the second multi-level feature map set through the warp layer according to the global optical flow estimation result and the local optical flow estimation result includes:
Warping the feature maps of the second multi-level feature atlas through the first type of warping layer so as to align them with the feature maps of the first multi-level feature atlas, thereby obtaining the coarse-aligned first multi-level feature atlas and the coarse-aligned second multi-level feature atlas; and/or,
using an optical flow estimation result of a previous layer to warp a feature map of a current layer through the second type of warping layer so as to align the feature map of each layer of the second multi-level feature map set with the feature map of a corresponding level in the first multi-level feature map set, thereby obtaining the coarse alignment first multi-level feature map set and the coarse alignment second multi-level feature map set, wherein the optical flow estimation result comprises the global optical flow estimation result and the local optical flow estimation result;
the second type of warping layer is applied to all feature maps in the multi-level feature atlas except the bottommost (lowest-resolution) feature map.
Optionally, the unmanned aerial vehicle remote sensing image combination includes a coarse-aligned first unmanned aerial vehicle remote sensing image and a coarse-aligned second unmanned aerial vehicle remote sensing image, and the step of performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination based on the progressive difference feature fusion and detection module, so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain the multi-level feature difference map subjected to parallax change processing, includes:
Determining a feature difference map between the feature map of the coarse-aligned second unmanned aerial vehicle remote sensing image and the feature map of the coarse-aligned first unmanned aerial vehicle remote sensing image, and calculating the absolute value of the feature difference map;
performing up-sampling processing on the feature difference image generated in the previous level to obtain an up-sampling feature difference image;
and fusing the absolute value of the difference map and the upsampled characteristic difference map through an attention mechanism and multi-layer convolution processing to obtain the multi-layer characteristic difference map.
Optionally, before the step of obtaining the multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image by downsampling at least two unmanned aerial vehicle remote sensing images collected from the garbage scattering area by the unmanned aerial vehicle based on the feature pyramid module, the method further includes:
based on an endpoint error loss function, constraining the optical flow error loss of the feature-aligned unmanned aerial vehicle remote sensing image combination relative to the initially acquired unmanned aerial vehicle remote sensing image combination;
and constraining characteristic errors between deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
Optionally, before the step of obtaining the multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image by downsampling at least two unmanned aerial vehicle remote sensing images collected from the garbage scattering area by the unmanned aerial vehicle based on the feature pyramid module, the method further includes:
Acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random overturn to obtain a preprocessed remote sensing image training data set;
determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
training a preset neural network in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced;
the remote sensing image training data set comprises a change map obtained by synthesizing a garbage distribution image with viewpoint difference and a remote sensing image acquired by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
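By way of illustration only, and not as part of the disclosed method, the random flipping and random affine transformation described above might be sketched roughly as follows; the transformation ranges and the PyTorch/torchvision helpers used here are assumptions:

```python
import random
import torch
import torchvision.transforms.functional as TF

def augment_pair(img_a: torch.Tensor, img_b: torch.Tensor, max_deg: float = 5.0):
    """Randomly flip an image pair, then apply a random affine transform to the
    second image to enhance the viewpoint difference (illustrative values only)."""
    # Random horizontal / vertical flips applied consistently to both images.
    if random.random() < 0.5:
        img_a, img_b = TF.hflip(img_a), TF.hflip(img_b)
    if random.random() < 0.5:
        img_a, img_b = TF.vflip(img_a), TF.vflip(img_b)
    # Random affine transform on one image of the pair to simulate a viewpoint change.
    angle = random.uniform(-max_deg, max_deg)
    translate = [random.randint(-8, 8), random.randint(-8, 8)]
    scale = random.uniform(0.95, 1.05)
    shear = [random.uniform(-2.0, 2.0)]
    img_b = TF.affine(img_b, angle=angle, translate=translate, scale=scale, shear=shear)
    return img_a, img_b
```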
In addition, to achieve the above object, the present invention further provides an end-to-end variation detecting network model, including:
the feature pyramid module is used for downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
The optical flow alignment module is used for carrying out feature alignment mapping on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with the aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for carrying out feature alignment on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images;
the progressive difference feature fusion and detection module is used for performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
In addition, in order to achieve the above object, the present invention also provides an unmanned aerial vehicle parallax change processing system, comprising: a memory, a processor, and a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection that is stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection described above.
In addition, in order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection, which when executed by a processor, implements the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection as described above.
The embodiments of the invention provide a remote sensing image parallax change processing method, device and storage medium for unmanned aerial vehicle garbage scattering area detection. By providing an end-to-end change detection network model that performs optical flow estimation on the feature maps while simultaneously executing change detection, the network can directly process images with viewpoint differences. This effectively improves the performance of the unmanned aerial vehicle system in scenes with accumulated errors or large parallax changes in the remote sensing images, and improves throughput by eliminating repeated feature extraction.
Drawings
Fig. 1 is a schematic architecture diagram of a hardware operating environment of an unmanned aerial vehicle parallax change processing system according to an embodiment of the present invention;
fig. 2 is a flowchart of a first embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
Fig. 3 is a schematic flow chart of a second embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
fig. 4 is a schematic diagram of another flow involved in a second embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
fig. 5 is a schematic flow chart of a third embodiment of a remote sensing image parallax change processing method for detecting a garbage scattering area of an unmanned aerial vehicle according to the present invention;
fig. 6 is a schematic diagram of an end-to-end variation detection network model related to a remote sensing image parallax variation processing method for detecting a garbage scattering area of an unmanned aerial vehicle.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
According to the method, an end-to-end change detection network model mounted on the unmanned aerial vehicle parallax change processing system is provided. At least two unmanned aerial vehicle remote sensing images with viewing angle differences are first preprocessed by the feature pyramid module of the end-to-end change detection network model, which improves the network model's ability to handle viewing angle differences in the unmanned aerial vehicle remote sensing images. The feature maps are then aligned by means of optical flow estimation, so that viewing angle differences between unmanned aerial vehicle remote sensing images can be detected accurately even when corresponding pixels are not divided into grid cells at the same positions. When a viewing angle difference is detected, the images with viewpoint differences are processed by the progressive difference feature fusion and detection module. This improves the performance of the unmanned aerial vehicle system in scenes with accumulated errors or large parallax changes in the remote sensing images, improves throughput by eliminating repeated feature extraction, and avoids introducing errors into the change detection region, while also removing the dependence on grid cell size found in traditional parallax change detection models, thereby enabling image change processing with larger viewing angle differences.
In order to better understand the above technical solution, exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As an implementation scheme, fig. 1 is a schematic architecture diagram of a hardware running environment of an unmanned aerial vehicle parallax change processing system according to an embodiment of the present invention.
As shown in fig. 1, the unmanned aerial vehicle parallax change processing system may include: a processor 1001, such as a CPU, memory 1005, user interface 1003, network interface 1004, communication bus 1002. Wherein the communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a Display, an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may further include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the unmanned aerial vehicle parallax change processing system architecture shown in fig. 1 is not limiting of the unmanned aerial vehicle parallax change processing system, and may include more or fewer components than illustrated, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection. The operating system is a program that manages and controls the hardware and software resources of the unmanned aerial vehicle parallax change processing system and supports the running of the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection as well as other software or programs.
In the unmanned aerial vehicle parallax change processing system shown in fig. 1, the user interface 1003 is mainly used for connecting to a terminal and performing data communication with the terminal; the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; and the processor 1001 may be configured to invoke the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005.
In this embodiment, the unmanned aerial vehicle parallax change processing system includes: a memory 1005, a processor 1001, and a remote sensing image parallax change processing program stored on the memory and operable on the processor for unmanned aerial vehicle garbage scattering area detection, wherein:
when the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
based on a progressive difference feature fusion and detection module, performing feature difference mapping on the unmanned aerial vehicle remote sensing image combination so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtaining a multi-level feature difference map subjected to parallax change processing.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
performing feature extraction on the unmanned aerial vehicle remote sensing image to obtain a plurality of feature images with different layers, and forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image;
the number of channels of the feature map located at the next layer is twice the number of channels of the feature map located at the adjacent upper layer.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas;
invoking an optical flow estimator;
determining a global optical flow estimation result of a lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining a local optical flow estimation result of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator, wherein the multi-level feature map set comprises the first multi-level feature map set and the second multi-level feature map set;
According to the global optical flow estimation result and the local optical flow estimation result, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image;
and determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
warping the feature maps of the second multi-level feature atlas through the first type of warping layer so as to align them with the feature maps of the first multi-level feature atlas, thereby obtaining the coarse-aligned first multi-level feature atlas and the coarse-aligned second multi-level feature atlas; and/or,
using an optical flow estimation result of a previous layer to warp a feature map of a current layer through the second type of warping layer so as to align the feature map of each layer of the second multi-level feature map set with the feature map of a corresponding level in the first multi-level feature map set, thereby obtaining the coarse alignment first multi-level feature map set and the coarse alignment second multi-level feature map set, wherein the optical flow estimation result comprises the global optical flow estimation result and the local optical flow estimation result;
The second type of warping layer is applied to all feature maps in the multi-level feature atlas except the bottommost (lowest-resolution) feature map.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
determining a feature difference map between the feature map of the coarse-aligned second unmanned aerial vehicle remote sensing image and the feature map of the coarse-aligned first unmanned aerial vehicle remote sensing image, and calculating the absolute value of the feature difference map;
performing up-sampling processing on the feature difference image generated in the previous level to obtain an up-sampling feature difference image;
and fusing the absolute value of the difference map and the upsampled characteristic difference map through an attention mechanism and multi-layer convolution processing to obtain the multi-layer characteristic difference map.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
based on an endpoint error loss function, constraining the optical flow error loss of the feature-aligned unmanned aerial vehicle remote sensing image combination relative to the initially acquired unmanned aerial vehicle remote sensing image combination;
and constraining characteristic errors between deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
When the processor 1001 calls a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection stored in the memory 1005, it executes the following operations:
acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random overturn to obtain a preprocessed remote sensing image training data set;
determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
training a preset neural network in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced;
the remote sensing image training data set comprises a change map obtained by synthesizing a garbage distribution image with viewpoint difference and a remote sensing image acquired by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
Based on the hardware architecture of the unmanned aerial vehicle parallax change processing system based on the computer vision technology, the embodiment of the remote sensing image parallax change processing method for detecting the unmanned aerial vehicle garbage scattering area is provided.
First embodiment
In this embodiment, the unmanned aerial vehicle parallax change processing system is equipped with the end-to-end change detection network model provided in this embodiment, and the end-to-end change detection network model mainly includes three major modules: the device comprises a feature pyramid module, an optical flow alignment module and a progressive difference feature fusion and detection module.
The feature pyramid module is mainly used for carrying out downsampling on the remote sensing image of the unmanned aerial vehicle, so that the processing speed of the end-to-end change detection network model is improved while a feature map with richer semantic information is obtained, and the subsequent module can be conveniently identified.
The main purpose of the optical flow alignment module is to align the multi-level feature atlas obtained after the feature pyramid module is processed, so that the performance deterioration of the whole unmanned aerial vehicle parallax change processing system caused by accumulated errors in the unmanned aerial vehicle image acquisition process is avoided, and the reduction of the system throughput caused by extracting excessive redundant features is avoided.
Because the image frames in the unmanned aerial vehicle remote sensing image combination obtained after alignment by the optical flow alignment module are multi-level feature maps, repeated features exist among the different layers. The progressive difference feature fusion and detection module provided in this embodiment is therefore adopted to eliminate the repeated features among the image frames inside the combination, generate the multi-level feature difference map subjected to parallax change processing, and complete the parallax change processing of the unmanned aerial vehicle remote sensing images.
Referring to fig. 2, in this embodiment, the remote sensing image parallax change processing method for detecting the garbage scattering area of the unmanned aerial vehicle includes the following steps:
step S10, based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
in this embodiment, the unmanned aerial vehicle carries out the low altitude flight in the top of rubbish scattering area, sets up the monocular camera on the unmanned aerial vehicle and takes the remote sensing image of below rubbish scattering area, unmanned aerial vehicle remote sensing image promptly, in the unmanned aerial vehicle parallax change processing system of unmanned aerial vehicle built-in the input unmanned aerial vehicle remote sensing image to this embodiment, carries out parallax change processing to unmanned aerial vehicle remote sensing image. Because parallax change occurs based on two image frames continuously acquired in the unmanned aerial vehicle flight process, we need to acquire at least two unmanned aerial vehicle remote sensing images first.
In the step, the feature pyramid module performs downsampling on at least two input unmanned aerial vehicle remote sensing images to obtain a plurality of multi-level feature images corresponding to the unmanned aerial vehicle remote sensing images, and a multi-level feature image set is formed.
As an alternative embodiment, the downsampling is specifically as follows:
First, feature extraction is performed on the unmanned aerial vehicle remote sensing image to obtain several feature maps at different levels, forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image. Among the extracted feature maps, the number of channels of the feature map at the next (lower) level is twice the number of channels of the feature map at the adjacent upper level.
Optionally, the feature pyramid module may be built with a VGG16 network, i.e., a VGG (Visual Geometry Group) network with 16 weight layers, formed by alternately stacking convolution layers and pooling layers and ending with fully-connected layers for classification.
For example, suppose 4 levels of feature maps are extracted and the initial unmanned aerial vehicle remote sensing image contains 512 channels. First, to reduce the number of network parameters, the number of channels is adjusted to 256, i.e., half of the original number; then 4 feature maps at different levels are extracted from the input image frame. These feature maps encapsulate different semantic information; the sizes of the 4 levels are 1/2, 1/4, 1/8 and 1/16 of the input image size, and their channel numbers are 64, 128, 256 and 256 respectively.
The multi-level feature maps extracted by the feature pyramid module contain both deep semantic information and shallow, accurate positional details; fusing the deep and shallow feature maps therefore yields rich semantic information together with accurate position information.
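As a purely illustrative sketch (one possible realization, not the reference implementation of this embodiment), a VGG-style feature pyramid with four levels at 1/2, 1/4, 1/8 and 1/16 resolution and 64, 128, 256 and 256 channels, as in the example above, could look as follows in PyTorch:

```python
import torch
import torch.nn as nn

class FeaturePyramid(nn.Module):
    """VGG-style feature pyramid: four levels at 1/2, 1/4, 1/8 and 1/16 of the
    input resolution with 64, 128, 256 and 256 channels (illustrative sketch)."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        def block(c_in, c_out):
            # Two 3x3 convolutions followed by a stride-2 max pooling (downsampling).
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )
        self.level1 = block(in_channels, 64)   # 1/2 resolution,  64 channels
        self.level2 = block(64, 128)           # 1/4 resolution, 128 channels
        self.level3 = block(128, 256)          # 1/8 resolution, 256 channels
        self.level4 = block(256, 256)          # 1/16 resolution, 256 channels

    def forward(self, x):
        f1 = self.level1(x)
        f2 = self.level2(f1)
        f3 = self.level3(f2)
        f4 = self.level4(f3)
        return [f1, f2, f3, f4]   # multi-level feature atlas

# Example: two UAV remote sensing images produce two multi-level feature atlases.
# pyramid = FeaturePyramid()
# feats_a = pyramid(torch.randn(1, 3, 256, 256))
# feats_b = pyramid(torch.randn(1, 3, 256, 256))
```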
Step S20, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images based on an optical flow alignment module to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
In this embodiment, after the multi-level feature atlas is obtained, feature alignment mapping is performed on the multi-level feature atlas based on the optical flow alignment module. Optical flow is an effective method for establishing dense correspondences between a pair of images, and alignment of the image feature maps is achieved by means of optical flow.
The optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images.
The optical flow estimator consists of a multi-layer convolutional neural network for estimating pixel displacement between successive frames; its objective is to find a displacement vector for each pixel in adjacent image frames that describes the direction and speed of object movement in the image. In this embodiment, the number of channels is gradually reduced through the network, finally generating two-channel optical flow information, where the two channels represent the horizontal and vertical displacement of each pixel. Optionally, the optical flow estimator may consist of 6 convolution layers designed for optical flow calculation, all using 3x3 convolution kernels.
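A minimal sketch of such an estimator, assuming six 3x3 convolution layers that progressively reduce the channel count down to a two-channel flow output (the intermediate channel widths are assumptions), might be:

```python
import torch.nn as nn

def make_flow_estimator(in_channels: int) -> nn.Sequential:
    """Six 3x3 convolution layers that progressively reduce the channel count
    and end with a 2-channel output (horizontal and vertical displacement).
    Intermediate widths are illustrative assumptions."""
    widths = [128, 96, 64, 32, 16]
    layers, c = [], in_channels
    for w in widths:
        layers += [nn.Conv2d(c, w, kernel_size=3, padding=1), nn.LeakyReLU(0.1, inplace=True)]
        c = w
    layers.append(nn.Conv2d(c, 2, kernel_size=3, padding=1))  # 6th conv: flow (dx, dy)
    return nn.Sequential(*layers)
```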
The warping layer is the layer in the end-to-end change detection network model that performs the warping operation. It is responsible for deforming the feature maps of two matched images so that image B, serving as the matched object, is warped toward image A, serving as the matching reference, thereby aligning the two feature maps.
Illustratively, the warping layer is defined as follows:
F_w(x) = F_b(x + of(x))
where F_w denotes the warped feature map, x is a coordinate in the feature map F_b of image B serving as the matched object, and of(x) is the optical flow estimation result at x.
Optionally, in the end-to-end change detection network model proposed in this embodiment, the warping layers may also be divided into two types. The first type uses the optical flow estimated at the same level to warp the feature map of image B, serving as the matched object, into alignment with the feature map of image A, serving as the matching reference; its aim is to strengthen the alignment of the feature maps and thereby improve the accuracy of change detection. The second type uses the optical flow estimated at the previous level to warp the feature map of image B at the current level, roughly aligning it with the feature map of image A. This type of warping layer is applicable to all levels except the lowest one; its aim is to achieve coarse alignment of the feature maps and thereby improve the accuracy of the optical flow estimation at the current level. Because it performs coarse alignment with the optical flow information estimated at the previous stage, accurate optical flow is obtained while the computational effort and computation time are reduced.
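As an illustrative sketch of the warping operation defined above, assuming bilinear sampling (an implementation assumption) and flow expressed in pixels:

```python
import torch
import torch.nn.functional as F

def warp(feat_b: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp feature map F_b with optical flow of(x), i.e. sample F_b at x + of(x).
    feat_b: (N, C, H, W), flow: (N, 2, H, W) in pixels. Bilinear sampling via
    grid_sample is an implementation assumption."""
    n, _, h, w = feat_b.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat_b.device)   # (2, H, W)
    new_pos = grid.unsqueeze(0) + flow                               # x + of(x)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    new_x = 2.0 * new_pos[:, 0] / max(w - 1, 1) - 1.0
    new_y = 2.0 * new_pos[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((new_x, new_y), dim=-1)                # (N, H, W, 2)
    return F.grid_sample(feat_b, sample_grid, align_corners=True)
```

Both types of warping layer can reuse such a function; they differ only in which optical flow estimate (same level or previous level) is supplied.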
As an alternative implementation, in this embodiment, the feature alignment map includes two maps, namely a local correlation map and a global correlation map, and the pixel-level correspondence between feature maps is constructed by using two maps of different ranges. And then calling an optical flow estimator preset in the optical flow alignment module, and determining an optical flow estimation result according to the optical flow estimator and two mappings, wherein the optical flow estimation result also comprises a global optical flow estimation result and the local optical flow estimation result. And performing feature alignment on the multi-level feature atlas through a warp layer preset in the optical flow alignment module, a global optical flow estimation result and a local optical flow estimation result.
Because the feature alignment is an action between two matched unmanned aerial vehicle remote sensing images, for convenience of description, the two matched unmanned aerial vehicle remote sensing images are called a first unmanned aerial vehicle remote sensing image and a second unmanned aerial vehicle remote sensing image, a multi-level feature atlas corresponding to the first unmanned aerial vehicle remote sensing image is called a first multi-level feature atlas, and a multi-level feature atlas corresponding to the second unmanned aerial vehicle remote sensing image is called a second multi-level feature atlas.
The process will be described below:
and firstly, calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas.
In this process, the difference between the local correlation map and the global correlation map lies in the calculation range: the local correlation map computes correspondences only within a specified distance d around coordinate x, whereas the global correlation map computes correspondences between the pixel at x and all pixel positions in the other feature map.
Illustratively, the correlation map is defined as follows:
c(x_1, x_2) = F_a(x_1)^T F_b(x_2)
where x_1 is a coordinate in the first multi-level feature atlas F_a and x_2 is a coordinate in the second multi-level feature atlas F_b.
The local correlation map is restricted to |x_1 - x_2| ≤ d, where d is the specified distance; optionally, d may be set to 1. No distance restriction is applied to the global correlation map.
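A rough sketch of the two correlation maps under these definitions (the tensor shapes and the loop-based local correlation are implementation assumptions) might be:

```python
import torch
import torch.nn.functional as F

def global_correlation(fa: torch.Tensor, fb: torch.Tensor) -> torch.Tensor:
    """c(x1, x2) = F_a(x1)^T F_b(x2) for every pair of positions.
    fa, fb: (N, C, H, W) -> (N, H*W, H, W) correlation volume."""
    n, c, h, w = fa.shape
    fa_flat = fa.view(n, c, h * w)                          # (N, C, HW)
    fb_flat = fb.view(n, c, h * w)                          # (N, C, HW)
    corr = torch.einsum("nci,ncj->nij", fa_flat, fb_flat)   # (N, HW, HW)
    return corr.view(n, h * w, h, w)

def local_correlation(fa: torch.Tensor, fb: torch.Tensor, d: int = 1) -> torch.Tensor:
    """Correlation restricted to |x1 - x2| <= d (d = 1 gives a 3x3 neighbourhood)."""
    n, c, h, w = fa.shape
    fb_pad = F.pad(fb, (d, d, d, d))
    out = []
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = fb_pad[:, :, dy:dy + h, dx:dx + w]
            out.append((fa * shifted).sum(dim=1, keepdim=True))  # dot product per pixel
    return torch.cat(out, dim=1)                            # (N, (2d+1)^2, H, W)
```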
And secondly, invoking an optical flow estimator, and then determining a global optical flow estimation result of the lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining local optical flow estimation results of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator. Note that the multi-level feature atlas here includes the first multi-level feature atlas and the second multi-level feature atlas.
It should be noted that in this process, since estimating optical flow using global correlation mapping on a high resolution feature map is a computationally intensive and time consuming process, only global correlation is used here to perform optical flow estimation on the lowest resolution feature map. For alignment of feature maps of other levels, then the local correlation mapping is used to estimate optical flow.
It should further be noted that determining the optical flow estimation result does not differ from conventional optical flow estimation with an optical flow estimator; the difference is that in this embodiment the estimation is divided into different ranges (i.e. global or local), and different optical flow estimation strategies are adopted for feature maps of different resolutions in the multi-level feature atlas, which preserves the accuracy of the estimation result as far as possible while reducing the computational cost of the optical flow alignment module in the system.
And thirdly, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer according to the global optical flow estimation result and the local optical flow estimation result to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image.
In the process, estimating pixel displacement between the whole image frames through a global optical flow estimation result, so as to determine information of the flying movement direction and speed of an object in the unmanned aerial vehicle remote sensing image; and the local optical flow estimation is used to estimate more detailed displacement information of local areas in the image, thereby capturing subtle changes in the image and relative motion between objects. And through the warping layer, the pixel points in the two image frames are moved based on the two optical flow estimation results so as to realize rough alignment.
Fourthly, determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
And constructing and forming an unmanned aerial vehicle remote sensing image combination by using the two matched coarse alignment unmanned aerial vehicle remote sensing images, namely the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image, and continuously carrying out parallax change processing by taking the unmanned aerial vehicle remote sensing image combination as a unit in the subsequent steps.
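Tying these pieces together, a coarse-to-fine alignment loop over the pyramid levels might be organized roughly as follows; it assumes the helper functions sketched earlier (global_correlation, local_correlation, warp and per-level flow estimators), and the residual-flow refinement is an implementation assumption:

```python
import torch.nn.functional as F

def coarse_to_fine_align(feats_a, feats_b, estimators):
    """Coarse-to-fine optical flow alignment over a multi-level feature atlas.
    feats_a / feats_b: lists ordered from highest to lowest resolution.
    estimators: one flow estimator per level. All names are illustrative."""
    flows = [None] * len(feats_a)
    for lvl in range(len(feats_a) - 1, -1, -1):           # start at the lowest resolution
        fa, fb = feats_a[lvl], feats_b[lvl]
        if lvl == len(feats_a) - 1:
            corr = global_correlation(fa, fb)              # global matching, coarsest level only
            flows[lvl] = estimators[lvl](corr)
        else:
            up_flow = 2.0 * F.interpolate(flows[lvl + 1], size=fa.shape[-2:],
                                          mode="bilinear", align_corners=True)
            fb_coarse = warp(fb, up_flow)                  # second-type warping layer
            corr = local_correlation(fa, fb_coarse, d=1)   # local matching elsewhere
            flows[lvl] = up_flow + estimators[lvl](corr)   # refine the upsampled flow
    warped_b = [warp(fb, fl) for fb, fl in zip(feats_b, flows)]   # final per-level alignment
    return feats_a, warped_b, flows
```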
Step S30, based on a progressive difference feature fusion and detection module, performing feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtaining a multi-level feature difference map subjected to parallax change processing.
In this embodiment, after the unmanned aerial vehicle remote sensing image combination is obtained, feature difference mapping is performed on the combination based on the progressive difference feature fusion and detection module, so as to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
As an optional embodiment, the feature difference mapping is performed as follows. First, a feature difference map is determined between the feature map of the coarse-aligned second unmanned aerial vehicle remote sensing image and the feature map of the coarse-aligned first unmanned aerial vehicle remote sensing image, and the absolute value of the feature difference map is calculated. The feature difference map can be obtained by subtracting the feature maps of corresponding levels of the coarse-aligned second unmanned aerial vehicle remote sensing image and the coarse-aligned first unmanned aerial vehicle remote sensing image; taking the absolute value of the feature difference map yields the difference map absolute value.
Then, the feature difference map generated at the previous level is upsampled to obtain an upsampled feature difference map. The purpose of this step is to increase the resolution of the feature difference map. Optionally, the upsampling may be performed with nearest neighbor interpolation or bilinear interpolation.
Finally, the difference map absolute value and the upsampled feature difference map are fused through an attention mechanism and multi-layer convolution processing to obtain the multi-level feature difference map.
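One possible sketch of a single fusion step at a given level, combining the absolute difference, the upsampled previous-level difference map and an optional attention module (the module layout is an assumption), is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferenceFusion(nn.Module):
    """One step of progressive difference feature fusion: |F_a - F_b| at the
    current level is fused with the upsampled difference map from the previous
    (coarser) level via attention and convolutions (layout is illustrative)."""

    def __init__(self, channels: int, prev_channels: int, attention: nn.Module = None):
        super().__init__()
        # An attention module (e.g. channel + spatial attention, sketched after the
        # formulas below) can be plugged in here; identity is used as a placeholder.
        self.attention = attention if attention is not None else nn.Identity()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels + prev_channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, feat_a, feat_b_warped, prev_diff):
        diff = torch.abs(feat_a - feat_b_warped)            # absolute value of the difference map
        prev_up = F.interpolate(prev_diff, size=diff.shape[-2:],
                                mode="bilinear", align_corners=True)
        fused = torch.cat([diff, prev_up], dim=1)
        return self.fuse(self.attention(fused))
```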
Note that the role of the attention mechanism is to direct the focus of the model to locally important information. The attention mechanism includes a channel attention mechanism and a spatial attention mechanism.
The channel attention mechanism improves the network model by learning a channel attention pattern that enhances or suppresses different channels, so that the end-to-end change detection network model can combine the information carried by different channels.
It is noted that feature difference maps generated at different levels contain different information, and shallow layers produce feature difference maps with fine positional details and object-specific features, making them particularly effective in detecting small objects. In contrast, feature difference maps from deeper layers have rich semantic information making them more suitable for detecting larger targets. To take advantage of these different levels, the present embodiment fuses their feature difference maps using a channel attention mechanism, which helps reduce the difference in semantic information and location information across different channels.
Illustratively, the channel attention is defined as follows:
M_c = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
where σ denotes the sigmoid function, F denotes the fused multi-level feature difference map, and MLP denotes a multi-layer perceptron used to learn the channel weights, which determine the importance of each channel in the channel attention mechanism.
The spatial attention mechanism, in contrast, acts on the individual pixels of a feature map: by learning a spatial attention pattern, particular pixels in the feature map can be enhanced or suppressed, so that the end-to-end change detection network model can focus on the key pixels of interest in the feature map. These changed key pixels typically constitute only a small portion of the overall spatial attention pattern. Therefore, this embodiment adopts a spatial attention mechanism to learn the spatial attention pattern, allowing the model to allocate more attention to the pixels that have undergone change, which increases the sensitivity of the model to the pixels that matter most in the change detection task.
Illustratively, the spatial attention is defined as follows:
$M_s = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\big)$
where $f^{7\times 7}$ denotes a convolution operator with a filter size of 7×7, and $F$ denotes the fused multi-level feature difference map.
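Illustratively, under the same assumptions, the spatial attention may be sketched as follows; the class name `SpatialAttention` is an illustrative assumption. A refined difference map would then be obtained by multiplying the fused map by the channel attention weights and the spatial attention weights in turn:

```python
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Spatial attention M_s = sigmoid(f7x7([AvgPool(F); MaxPool(F)])) (sketch)."""

    def __init__(self, kernel_size=7):
        super().__init__()
        # 7x7 convolution applied to the two-channel pooled descriptor map.
        self.conv = nn.Conv2d(2, 1, kernel_size=kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)        # channel-wise average pooling
        mx, _ = x.max(dim=1, keepdim=True)       # channel-wise max pooling
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return attn                              # one weight per pixel
```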
In the technical scheme provided by this embodiment, an end-to-end change detection network model mounted on an unmanned aerial vehicle parallax change processing system is provided. In this model, at least two unmanned aerial vehicle remote sensing images with viewing angle differences are first preprocessed by a feature pyramid module, which improves the ability of the network model to handle viewing angle differences between unmanned aerial vehicle remote sensing images. The feature maps are then aligned by means of optical flow estimation, so that a viewing angle difference between the unmanned aerial vehicle remote sensing images can be detected accurately even when corresponding pixels are not located at the same positions. When a viewing angle difference is detected, the images with viewpoint differences are processed by the progressive difference feature fusion and detection module. This improves the performance of the unmanned aerial vehicle system in scenes with accumulated errors or large parallax changes in the remote sensing images, increases throughput by eliminating repeated feature extraction, avoids introducing errors into the change detection area, and removes the dependence on grid-cell size found in traditional parallax change detection models, thereby realizing change processing for images with larger viewing angle differences.
Second embodiment
Referring to fig. 3, based on the first embodiment, before the step S10, the method further includes:
step S40, based on an endpoint error loss function, constraining the optical flow error loss between the feature-aligned unmanned aerial vehicle remote sensing image combination and the initially acquired unmanned aerial vehicle remote sensing image combination;
as an alternative embodiment, in order to improve the capability of the end-to-end change detection network model to handle images with larger viewpoint differences, an endpoint error (EPE) loss function is used as a supervision mechanism to constrain the optical flow error loss between the feature-aligned unmanned aerial vehicle remote sensing image combination and the initially acquired unmanned aerial vehicle remote sensing images, so as to ensure that the end-to-end change detection network model can accurately predict the optical flow.
The end point error loss function is used in this embodiment to calculate the euclidean distance between the predicted optical flow and the ground-truth optical flow. Illustratively, the endpoint error loss function is defined as follows:
$L_{epe} = \sum_{l} \lambda^{epe}_{l}\,\big\lVert \hat{F}^{l} - F^{l}_{gt} \big\rVert_{2}$

where $\hat{F}^{l}$ denotes the level-$l$ optical flow predicted by the end-to-end change detection network model, $F^{l}_{gt}$ denotes the ground-truth optical flow used as the reference value, and $\lambda^{epe}_{l}$ is the endpoint error loss weight parameter.
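Illustratively, assuming a PyTorch-based implementation in which the per-level weights are hyperparameters, the endpoint error supervision may be sketched as follows; the function name and argument layout are illustrative assumptions:

```python
import torch


def endpoint_error_loss(pred_flows, gt_flows, weights):
    """Weighted sum of per-level endpoint errors (illustrative sketch).

    pred_flows, gt_flows: lists of flow tensors of shape (B, 2, H_l, W_l), one per level.
    weights: list of per-level loss weights (assumed hyperparameters).
    """
    loss = 0.0
    for pred, gt, w in zip(pred_flows, gt_flows, weights):
        # Euclidean distance between predicted and ground-truth flow vectors per pixel.
        epe = torch.norm(pred - gt, p=2, dim=1)   # (B, H_l, W_l)
        loss = loss + w * epe.mean()
    return loss
```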
Referring to fig. 4, based on any of the foregoing embodiments, before step S10, the method further includes:
Step S50, constraining the feature errors between the deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
Further, in this embodiment, the final change map is obtained by combining the multi-level feature difference maps generated by the change detection networks at different levels. The quality of the feature difference map at each level therefore directly affects the overall quality of the final change map, and the effectiveness of the feature difference map at one level is also affected by that of the previous level. Consequently, an error occurring at a certain level of the deep feature difference maps can be amplified in the feature difference maps of subsequent levels, and the gradient vanishing problem may appear during back propagation, which poses challenges for training. To avoid the performance degradation caused by accumulated errors during long-term operation of the end-to-end change detection network model, this embodiment applies a deep supervision strategy based on a binary cross entropy (BCE) loss function in the network: the feature difference map generated at each level of the change detection network is supervised, and the results are combined to obtain the final change map.
Illustratively, the binary cross entropy loss function is defined as follows:
$L_{bce} = -\sum_{l} \lambda^{bce}_{l} \sum_{x} \big[\, GT_{l}(x)\log p_{l}(x) + (1 - GT_{l}(x))\log(1 - p_{l}(x)) \,\big]$

where $GT_{l}(x)$ denotes the label of pixel $x$ at level $l$, $p_{l}(x)$ denotes the network's prediction for pixel $x$ at level $l$, and $\lambda^{bce}_{l}$ is the binary cross entropy loss weight parameter.
Finally, in this embodiment, the total loss for the entire training process is defined as the sum of the end point error loss and the binary cross entropy loss:
$L_{total} = L_{epe} + L_{bce}$
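Illustratively, under the same assumptions, the deeply supervised binary cross entropy term and the total loss may be sketched as follows, reusing the `endpoint_error_loss` sketch given above; the function names, per-level weights and the use of logits are illustrative assumptions:

```python
import torch.nn.functional as F


def deep_supervision_bce_loss(pred_change_maps, gt_change_maps, weights):
    """Weighted BCE applied to the change map predicted at every level (sketch)."""
    loss = 0.0
    for pred, gt, w in zip(pred_change_maps, gt_change_maps, weights):
        # pred is assumed to be raw logits; gt is the binary label map of that level.
        loss = loss + w * F.binary_cross_entropy_with_logits(pred, gt)
    return loss


def total_loss(pred_flows, gt_flows, flow_weights,
               pred_change_maps, gt_change_maps, bce_weights):
    """L_total = L_epe + L_bce, as defined above (illustrative sketch)."""
    # endpoint_error_loss is the sketch defined earlier in this document.
    return (endpoint_error_loss(pred_flows, gt_flows, flow_weights)
            + deep_supervision_bce_loss(pred_change_maps, gt_change_maps, bce_weights))
```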
In the technical scheme provided by this embodiment, in order to improve the capability of the end-to-end change detection network model to handle images with larger viewpoint differences and to avoid the performance degradation caused by accumulated errors during its long-term operation, an endpoint error (EPE) loss function is used as a supervision mechanism to constrain the optical flow error loss between the feature-aligned unmanned aerial vehicle remote sensing image combination and the initially acquired unmanned aerial vehicle remote sensing images, and a binary cross entropy loss function is used as a deep supervision strategy to constrain the feature errors between the deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing images.
Third embodiment
Referring to fig. 5, before step S10, according to any embodiment, the method further includes:
step S60, acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random flipping to obtain a preprocessed remote sensing image training data set;
Step S70, determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
step S80, training an end-to-end change detection network model in the unmanned aerial vehicle based on the preprocessed remote sensing image training dataset after the viewpoint difference is enhanced;
as an alternative embodiment, in order to improve the prediction accuracy of the end-to-end change detection network model, the change detection network model needs to be trained with a large number of training samples; however, no public training data set exists for change detection under viewing angle differences. In this regard, this embodiment proposes a training data set capable of training both the viewpoint difference detection capability and the optical flow supervision capability of the model.
Most publicly available change detection data sets contain an image A as the matching target, an image B as the matched target, and a change map obtained by fusing image A and image B. However, when image A or image B undergoes an affine transformation such as rotation or translation, the positions of objects in the image change, so the accuracy of the fused change map may be affected, which in turn can degrade model performance. This limitation makes most publicly available change detection data sets unsuitable for creating change detection data sets with viewpoint differences.
In this embodiment, the remote sensing image training dataset includes a change map obtained by synthesizing a garbage distribution image with a viewpoint difference and a remote sensing image collected by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
Specifically, a large number of remote sensing images collected by unmanned aerial vehicles are gathered to form a remote sensing image training data set, and the data set is enhanced through random flipping to obtain the preprocessed remote sensing image training data set. The purpose of random flipping is to enhance the diversity and robustness of the data.
By way of example, a garbage scattering area in the preprocessed remote sensing image training data set is then determined. The garbage scattering area refers to the distribution area of the garbage of interest in an image, and this area can be determined with a common convolutional neural network algorithm.
In some embodiments, the application scenario of the unmanned aerial vehicle may be garbage scattering area detection: the unmanned aerial vehicle collects remote sensing images during low-altitude flight and identifies the garbage areas in those images, which constitute the garbage scattering area. When applied to the scenario of an unmanned aerial vehicle detecting a garbage scattering area, the garbage distribution image is generated by randomly combining object images of various materials, including but not limited to plastics, metals, and glass.
It should be noted that, in a real scene, a pair of images taken by the unmanned aerial vehicle at the same location but at different times naturally exhibits a viewpoint difference. To replicate such a viewpoint difference in the synthetic data set, a random affine transformation is applied to the image portion corresponding to the garbage scattering area so as to enhance the viewpoint difference between images in the data set.
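Illustratively, assuming a torchvision-based data pipeline, such random flipping and random affine augmentation may be sketched as follows; for simplicity the sketch applies the affine transform to the whole of image B rather than only to the garbage scattering area, and the parameter ranges are illustrative assumptions rather than limitations of the present application:

```python
import random
import torchvision.transforms.functional as TF


def augment_pair(img_a, img_b, change_map, max_deg=10.0, max_translate=0.05):
    """Random flip of the pair plus a random affine on image B to simulate a
    viewpoint difference (illustrative sketch; parameter ranges are assumed)."""
    # Random horizontal flip applied consistently to both images and the label.
    if random.random() < 0.5:
        img_a, img_b, change_map = TF.hflip(img_a), TF.hflip(img_b), TF.hflip(change_map)

    # Random affine transform (rotation + translation) applied to image B only,
    # so the pair exhibits a viewpoint difference while image A stays fixed.
    # The corresponding ground-truth optical flow and change map would have to
    # be regenerated from the same transform (see the sketch below).
    angle = random.uniform(-max_deg, max_deg)
    h, w = img_b.shape[-2:]
    tx = random.uniform(-max_translate, max_translate) * w
    ty = random.uniform(-max_translate, max_translate) * h
    img_b = TF.affine(img_b, angle=angle, translate=[int(tx), int(ty)],
                      scale=1.0, shear=[0.0])
    return img_a, img_b, change_map
```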
In addition, to improve the optical flow alignment capability of the optical flow alignment module in the end-to-end change detection network model, the optical flow data required to align these image pairs is also provided in the data set. That is, the remote sensing image training data set also needs to include the change map obtained by synthesizing the garbage distribution image with a viewpoint difference and the remote sensing image collected by the unmanned aerial vehicle, as well as the optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
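Since the viewpoint difference is produced by a known affine transform, the ground-truth optical flow required above can be derived directly from that transform. Illustratively, assuming a NumPy-based pipeline, a dense flow field may be obtained from a 2×3 affine matrix as follows; the function name and conventions are illustrative assumptions:

```python
import numpy as np


def affine_to_flow(affine_2x3, height, width):
    """Dense ground-truth optical flow induced by a known 2x3 affine matrix (sketch).

    The flow at pixel (x, y) is the displacement (x', y') - (x, y), where
    (x', y') is the affine-transformed position of (x, y).
    """
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing='ij')
    ones = np.ones_like(xs)
    coords = np.stack([xs, ys, ones], axis=0).reshape(3, -1)      # homogeneous coordinates
    warped = affine_2x3 @ coords                                  # transformed positions (2, H*W)
    flow_u = warped[0].reshape(height, width) - xs                # horizontal displacement
    flow_v = warped[1].reshape(height, width) - ys                # vertical displacement
    return np.stack([flow_u, flow_v], axis=0).astype(np.float32)  # (2, H, W)
```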
And finally, training the end-to-end change detection network model in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced.
In the technical scheme provided by the embodiment, a training data set capable of training the viewpoint difference detection capability and the optical flow supervision capability of the model is provided, and a construction flow of the training data set is provided, so that the prediction accuracy of the end-to-end change detection network model is improved.
In addition, referring to fig. 6, the present embodiment further proposes an end-to-end change detection network model, which includes:
the feature pyramid module 100 is configured to downsample at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area, so as to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
the optical flow alignment module 200 is configured to perform feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images, and obtain a pair of feature aligned unmanned aerial vehicle remote sensing image combinations, where the optical flow alignment module includes an optical flow estimator for optical flow estimation, and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
the progressive difference feature fusion and detection module 300 is configured to perform feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination, and obtain a multi-level feature difference map subjected to parallax change processing.
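Illustratively, assuming a PyTorch-based implementation, the three modules listed above may be composed into the end-to-end change detection network model as follows; the class names, constructor arguments and returned values are illustrative assumptions rather than limitations of the present application:

```python
import torch.nn as nn


class EndToEndChangeDetectionNet(nn.Module):
    """Skeleton composing the three modules described above (illustrative sketch)."""

    def __init__(self, feature_pyramid, optical_flow_alignment, diff_fusion_detector):
        super().__init__()
        self.feature_pyramid = feature_pyramid                  # module 100
        self.optical_flow_alignment = optical_flow_alignment    # module 200
        self.diff_fusion_detector = diff_fusion_detector        # module 300

    def forward(self, image_a, image_b):
        # Multi-level feature atlases for the two UAV remote sensing images.
        feats_a = self.feature_pyramid(image_a)
        feats_b = self.feature_pyramid(image_b)
        # Optical-flow-based coarse alignment of the two feature atlases.
        aligned_a, aligned_b, flows = self.optical_flow_alignment(feats_a, feats_b)
        # Progressive difference feature fusion and change detection.
        change_maps = self.diff_fusion_detector(aligned_a, aligned_b)
        return change_maps, flows
```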
Furthermore, it will be appreciated by those of ordinary skill in the art that implementing all or part of the processes in the methods of the above embodiments may be accomplished by computer programs to instruct related hardware. The computer program comprises program instructions, and the computer program may be stored in a storage medium, which is a computer readable storage medium. The program instructions are executed by at least one processor in the unmanned aerial vehicle parallax change processing system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a computer-readable storage medium storing a remote sensing image parallax change processing program for unmanned aerial vehicle garbage distribution area detection, which when executed by a processor, implements the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage distribution area detection as described in the above embodiments.
The computer readable storage medium may be a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
It should be noted that, because the storage medium provided in the embodiments of the present application is a storage medium used to implement the method in the embodiments of the present application, based on the method described in the embodiments of the present application, a person skilled in the art can understand the specific structure and the modification of the storage medium, and therefore, the description thereof is omitted herein. All storage media used in the methods of the embodiments of the present application are within the scope of protection intended in the present application.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. The remote sensing image parallax change processing method for detecting the unmanned aerial vehicle garbage scattering area is characterized by being applied to an unmanned aerial vehicle parallax change processing system, wherein the unmanned aerial vehicle parallax change processing system is provided with an end-to-end change detection network model, and the remote sensing image parallax change processing method for detecting the unmanned aerial vehicle garbage scattering area comprises the following steps:
based on a feature pyramid module, downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
Based on an optical flow alignment module, performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for performing feature alignment between the multi-level feature atlas corresponding to the two matched unmanned aerial vehicle remote sensing images;
based on a progressive difference feature fusion and detection module, performing feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
2. The method of claim 1, wherein the feature pyramid module is built based on a VGG16 network module, and the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by an unmanned aerial vehicle from a garbage scattering area based on the feature pyramid module to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image comprises:
performing feature extraction on the unmanned aerial vehicle remote sensing image to obtain a plurality of feature images with different layers, and forming the multi-level feature atlas corresponding to the unmanned aerial vehicle remote sensing image;
The number of channels of the feature map located at the next layer is twice the number of channels of the feature map located at the adjacent upper layer.
3. The method of claim 1, wherein the two matched drone remote sensing images include a first drone remote sensing image and a second drone remote sensing image, the multi-level feature atlas includes a first multi-level feature atlas corresponding to the first drone remote sensing image and a second multi-level feature atlas corresponding to the second drone remote sensing image, the step of obtaining a combination of a pair of feature-aligned drone remote sensing images based on the optical flow alignment module by performing feature alignment mapping on the multi-level feature atlas corresponding to the two matched drone remote sensing images comprises:
calculating a local correlation map and a global correlation map between the first multi-level feature atlas and the second multi-level feature atlas;
invoking an optical flow estimator;
determining a global optical flow estimation result of a lowest resolution feature map in the multi-level feature map set based on the global correlation map and the optical flow estimator, and determining a local optical flow estimation result of other feature maps except the lowest resolution feature map in the multi-level feature map set based on the local correlation map and the optical flow estimator, wherein the multi-level feature map set comprises the first multi-level feature map set and the second multi-level feature map set;
According to the global optical flow estimation result and the local optical flow estimation result, moving the pixel points in the first multi-level feature image set and the second multi-level feature image set through the warping layer to obtain a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image;
and determining the coarse alignment first unmanned aerial vehicle remote sensing image and the coarse alignment second unmanned aerial vehicle remote sensing image as the unmanned aerial vehicle remote sensing image combination.
4. The method of claim 3, wherein the warp layers comprise a first type of warp layer and a second type of warp layer, the steps of moving pixels in the first multi-level feature atlas and the second multi-level feature atlas through the warp layers to obtain a coarse alignment first multi-level feature atlas, and coarse alignment second multi-level feature atlas based on the global optical flow estimation result and the local optical flow estimation result comprise:
twisting the feature maps of the second multi-level feature atlas through the first type of warping layer so as to align the feature maps of the second multi-level feature atlas with the first multi-level feature atlas, thereby obtaining the coarse alignment first multi-level feature atlas and the coarse alignment second multi-level feature atlas; and/or,
Using an optical flow estimation result of a previous layer to warp a feature map of a current layer through the second type of warping layer so as to align the feature map of each layer of the second multi-level feature map set with the feature map of a corresponding level in the first multi-level feature map set, thereby obtaining the coarse alignment first multi-level feature map set and the coarse alignment second multi-level feature map set, wherein the optical flow estimation result comprises the global optical flow estimation result and the local optical flow estimation result;
wherein the second type of warping layer is applied to the feature maps of the multi-level feature map set at all layers other than the bottommost layer of feature map.
5. The method of claim 1, wherein the unmanned aerial vehicle remote sensing image combination comprises a coarse alignment first unmanned aerial vehicle remote sensing image and a coarse alignment second unmanned aerial vehicle remote sensing image, and wherein the step of performing, based on a progressive difference feature fusion and detection module, feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing comprises:
determining a characteristic map in the coarse alignment second unmanned aerial vehicle remote sensing image, a characteristic difference map between the characteristic map in the coarse alignment first unmanned aerial vehicle remote sensing image, and calculating a difference map absolute value of the characteristic difference map;
Performing up-sampling processing on the feature difference image generated in the previous level to obtain an up-sampling feature difference image;
and fusing the absolute value of the difference map and the upsampled characteristic difference map through an attention mechanism and multi-layer convolution processing to obtain the multi-layer characteristic difference map.
6. The method of claim 1, wherein the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area based on the feature pyramid module to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image further comprises:
based on the endpoint error loss function, constraining optical flow error loss between the unmanned aerial vehicle remote sensing image combination after characteristic alignment compared with the unmanned aerial vehicle remote sensing image combination acquired initially;
and constraining characteristic errors between deep features of the multi-level feature difference map and the input unmanned aerial vehicle remote sensing image based on a binary cross entropy loss function.
7. The method of claim 1, wherein the step of downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area based on the feature pyramid module to obtain a multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image further comprises:
Acquiring a remote sensing image training data set, and enhancing the remote sensing image training data set through random flipping to obtain a preprocessed remote sensing image training data set;
determining a garbage scattering area in the preprocessed remote sensing image training data set, and performing random affine transformation processing on image pairs in the garbage scattering area to enhance viewpoint differences in the preprocessed remote sensing image training data set;
training a preset neural network in the unmanned aerial vehicle based on the preprocessed remote sensing image training data set after the viewpoint difference is enhanced;
the remote sensing image training data set comprises a change map obtained by synthesizing a garbage distribution image with viewpoint difference and a remote sensing image acquired by an unmanned aerial vehicle, and optical flow information corresponding to the garbage distribution image, the remote sensing image and the change map.
8. An end-to-end variation detection network model, the end-to-end variation detection network model comprising:
the feature pyramid module is used for downsampling at least two unmanned aerial vehicle remote sensing images acquired by the unmanned aerial vehicle from the garbage scattering area to obtain multi-level feature atlas corresponding to each unmanned aerial vehicle remote sensing image;
The optical flow alignment module is used for carrying out feature alignment mapping on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images to obtain a pair of unmanned aerial vehicle remote sensing image combinations with the aligned features, wherein the optical flow alignment module comprises an optical flow estimator for optical flow estimation and a warping layer for carrying out feature alignment on the multi-level feature image sets corresponding to the two matched unmanned aerial vehicle remote sensing images;
the progressive difference feature fusion and detection module is used for performing feature difference mapping processing on the unmanned aerial vehicle remote sensing image combination to eliminate repeated features in the unmanned aerial vehicle remote sensing image combination and obtain a multi-level feature difference map subjected to parallax change processing.
9. An unmanned aerial vehicle parallax change processing system, characterized in that the unmanned aerial vehicle parallax change processing system comprises: a memory, a processor, and a remote sensing image parallax change processing program stored on the memory and operable on the processor, the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection implementing the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection as claimed in any one of claims 1 to 7 when executed by the processor.
10. A computer-readable storage medium, wherein a remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection is stored on the computer-readable storage medium, and the remote sensing image parallax change processing program for unmanned aerial vehicle garbage scattering area detection realizes the steps of the remote sensing image parallax change processing method for unmanned aerial vehicle garbage scattering area detection according to any one of claims 1 to 7 when executed by a processor.