GB2595122A - Method and apparatus - Google Patents

Method and apparatus

Info

Publication number
GB2595122A
GB2595122A GB2111302.2A GB202111302A GB2595122A GB 2595122 A GB2595122 A GB 2595122A GB 202111302 A GB202111302 A GB 202111302A GB 2595122 A GB2595122 A GB 2595122A
Authority
GB
United Kingdom
Prior art keywords
image
images
visible
earth
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB2111302.2A
Other versions
GB2595122B (en)
Inventor
Edward Geach James
James Smith Michael
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Hertfordshire
Original Assignee
University of Hertfordshire
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Hertfordshire filed Critical University of Hertfordshire
Priority to GB2111302.2A priority Critical patent/GB2595122B/en
Publication of GB2595122A publication Critical patent/GB2595122A/en
Application granted granted Critical
Publication of GB2595122B publication Critical patent/GB2595122B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/02Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
    • G01S7/41Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • G01S7/417Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section involving the use of neural networks
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/86Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
    • G01S13/867Combination of radar systems with cameras
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/89Radar or analogous systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/89Radar or analogous systems specially adapted for specific applications for mapping or imaging
    • G01S13/90Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
    • G01S13/9021SAR image post-processing techniques
    • G01S13/9027Pattern recognition for feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition
    • G06V10/12Details of acquisition arrangements; Constructional details thereof
    • G06V10/14Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/143Sensing or illuminating at different wavelengths
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/194Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Astronomy & Astrophysics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Electromagnetism (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Processing (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention relates to a method and apparatus for predicting images in the visible-infrared band of a region of the Earth's surface that would be observed by an Earth Observation (EO) satellite or other high-altitude imaging platform, using image data from radar reflectance/backscatter of the same region. A neural network, possibly having a generator and a discriminator, is trained to produce a mapping model. Additional information such as surface elevation and sun angle could be used in the training process. The method and apparatus can find application when the view between an optical imaging instrument (e.g. a camera) and the ground is obscured by cloud that is opaque to electromagnetic (EM) radiation in the visible-infrared spectral range, but is transparent to EM radiation in the radio-microwave part of the spectrum. Regular, uninterrupted monitoring of the Earth's surface is important for a wide range of applications, from agriculture (e.g. assessing crop growth) to defence (e.g. identifying military activity).

Description

Method and Apparatus
Field of the Invention
The present invention relates to a method and apparatus that can predict the visible-infrared band images of a region of the Earth's surface that would be observed by an Earth Observation (EO) satellite or other high-altitude imaging platform, using data from radar reflectance/backscatter of the same region. The method and apparatus can be used to predict images of the Earth's surface in the visible-infrared bands when the view between an imaging instrument (e.g. a camera) and the ground is obscured by cloud or some other medium that is opaque to electromagnetic (EM) radiation in the visible-infrared spectral range, approximately spanning 400-2300 nanometres (nm), but transparent to EM radiation in the radio-/microwave part of the spectrum.
Regular, uninterrupted monitoring of the Earth's surface is important for a wide range of applications, from agriculture (e.g. assessing crop growth, identifying signatures of drought or estimating yields) to defence (e.g. identifying changes in land use related to military activity or conflict).
Background of the Invention
EO satellites observe the surface of the Earth from orbit, delivering high-resolution images at different frequencies across the EM spectrum. The familiar combination of red, green, blue (RGB) bands yields an image akin to what the human eye sees, allowing one to visually distinguish (e.g.) lush pasture from a ploughed field. However, a rich variety of quantitative diagnostic information for remote sensing can be revealed from other combinations of observations across the visible-infrared spectral range.
For example, the red (R) and near-infrared (NIR) bands can be combined to calculate the so-called normalized difference vegetation index (NDVI), which is an established indicator for assessing the presence and density of vegetation or detecting freestanding water.
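As a minimal illustration of how such an index is computed, the following Python sketch assumes the red and NIR bands are supplied as reflectance-calibrated arrays (the function and variable names are illustrative only and not part of the invention):

```python
import numpy as np

def ndvi(red, nir):
    """Normalized difference vegetation index: (NIR - R) / (NIR + R)."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    # a small epsilon guards against division by zero over water or shadowed pixels
    return (nir - red) / (nir + red + 1e-12)
```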
Repeated "visits" by EO satellites observing the same region on a regular basis (e.g. daily, weekly, etc.) allow one to monitor and map temporal changes. An example could be the monitoring of the growth of a crop: one can use imaging from an EO satellite to determine the optimal time for harvest, estimate the expected yield, indicate where to irrigate or apply fertilizer, or identify anomalies related to crop stress that might require attention from land users.
A significant challenge in EO remote sensing is cloud cover, which can obscure the Earth's surface for imaging across the key visible and infrared wavebands. This is a serious issue for applications where regular and/or uninterrupted imaging is essential, such as monitoring during the peak growth cycle of a crop, or the rapid identification and mapping of environmental hazards such as flooding, or obtaining accurate intelligence regarding changes in land use. When the surface is obscured by cloud, direct imaging in the visible-infrared bands from any platform at an altitude equal to or above the cloud layer cannot be used. One would like a prediction of the view in the visible-infrared bands in these cases.
EO satellites that operate in the radio and microwave bands (e.g. the C-band at 4-8 GHz) of the EM spectrum can image the Earth's surface using Synthetic Aperture Radar (SAR), detecting the backscatter of the EM waves. Radar observations are not affected by cloud cover because clouds are transparent to radio waves. SAR imaging offers a route to regular imaging of the Earth's surface uninterrupted by cloud cover.
An example of an EO platform obtaining SAR imaging of the Earth's surface is the European Space Agency Sentinel-1 satellite constellation, operating C-band SAR. The European Space Agency Sentinel-2 satellites obtain imaging of the Earth's surface in several bands across the visible and infrared bands, approximately covering the 400-2300 nm spectral range. Other EO platforms exist and will continue to be developed to deliver SAR and visible-infrared imaging.
Ideally one would like to accurately predict the visible-infrared spectral response of the Earth's surface from SAR imaging alone so that the full range of remote sensing analytics that have been established in the visible-infrared bands (e.g. the NDVI, etc.) can be applied, even in the presence of cloud cover. However, since the radar reflectance and visible-infrared spectral response of any given patch of the Earth's surface are determined by different physics (e.g. absorption of certain frequencies of EM radiation in the visible bands by chlorophyll in leafy plants, versus reflectance and scattering of radio waves by leaf and stem surfaces), accurately translating SAR images to the corresponding visible-infrared images is not trivial. It is also for this reason that it is difficult to interpret SAR images directly to derive meaningful information about surface properties. For example, distinguishing a patch of bare soil within a pasture is straightforward in visible imagery because bare soil has a characteristic brown colour compared to the green grass. In SAR imagery the radio/microwave reflectance (i.e. detected backscatter) properties of the bare soil and surrounding grass might be very subtle, and not distinguishable by eye or even a basic statistical analysis. Therefore it is desirable to seek methodology that can aid in the interpretation of, and extract information from, SAR imagery.
Current approaches that seek to exploit SAR for remote sensing are often focused on agricultural applications and include:
* Empirically calibrating radar backscatter (single or multi-frequency and single or multi-polarization, single-epoch or multi-epoch) to biophysical parameters, such as (e.g.) the Leaf Area Index or fresh biomass, through (e.g.) regression analyses
* Radiative transfer modelling of radar backscatter for different ground conditions/structures (e.g. a field of wheat)
* Machine learning techniques that seek to predict (e.g.) single biophysical parameters, such as the Leaf Area Index, from radar backscatter measurements
The main drawback of these existing techniques is their specificity (e.g. to a single biophysical parameter) and, especially in the case of radiative transfer modelling, their complexity.
Another technique being investigated is 'deep learning' to predict a visible-infrared image or images from SAR imaging.
The Berkeley AI Research (BAIR) Laboratory has undertaken some work in relation to the use of a conditional Generative Adversarial Network (cGAN) for image-to-image translation in a paper authored by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou and Alexei A. Efros, "Image-to-Image Translation with Conditional Adversarial Nets". This paper investigates the use of a cGAN as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. These loss functions would previously need to be hand coded. Manually defining a loss function that accurately describes the accuracy of an image reproduction is an open problem. The paper discusses that the Pix2Pix cGAN approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colourizing images, among other tasks.
The paper sets out that GANs are generative models that learn a mapping from random noise vector z to output image y, G: z → y. In contrast, cGANs learn a mapping from observed image x and random noise vector z to y, G: {x, z} → y. In Isola et al. (2016) noise is included via dropout [Srivastava et al. 2014]. The generator G is trained to produce outputs that cannot be distinguished from "real" images by an adversarially trained discriminator, D, which is trained to do as well as possible at detecting the generator's "fakes".
The objective of a cGAN can be expressed as

L_cGAN(G, D) = E_{x,y}[log D(x, y)] + E_{x,z}[log(1 - D(x, G(x, z)))]

where G tries to minimize this objective against an adversarial D that tries to maximize it, i.e. G* = arg min_G max_D L_cGAN(G, D).
To test the importance of conditioning the discriminator, the paper also compares to an unconditional variant in which the discriminator does not observe x:

L_GAN(G, D) = E_y[log D(y)] + E_{x,z}[log(1 - D(G(x, z)))]

Previous approaches have found it beneficial to mix the GAN objective with a more traditional loss, such as L2 distance. This incentivises G to produce images that not only "fool" D, but that are also close quantitatively to the ground truth:

L_L1(G) = E_{x,y,z}[ ||y - G(x, z)||_1 ]

and as such their final objective is

G* = arg min_G max_D L_cGAN(G, D) + lambda L_L1(G)

An application created that uses this method is called Pix2Pix, and it is the basis for much work in this area in relation to translation of images of one type to another type, such as from a satellite image to a Google (RTM) map image.
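The generator side of this published objective can be sketched in Python (PyTorch) as follows; it is a minimal illustration only, assuming the discriminator returns a probability D(x, G(x, z)) in (0, 1) and using the common non-saturating formulation, with the function name and the weight lam being illustrative choices rather than values taken from the paper or the invention:

```python
import torch
import torch.nn.functional as F

def cgan_generator_loss(d_fake_prob, fake_image, real_image, lam=100.0):
    """Adversarial term (the generator wants D to label its output 'real')
    plus a weighted L1 term keeping the output close to the ground truth."""
    adversarial = F.binary_cross_entropy(d_fake_prob, torch.ones_like(d_fake_prob))
    l1 = F.l1_loss(fake_image, real_image)
    return adversarial + lam * l1
```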
"A Conditional Generative Adversarial Network to Fuse SAR And Multispectral Optical Data For Cloud Removal From Sentinel-2 Images"
S e et:
civegiarjai.Net rnovai From Seri 31-2 imaassi uses a cGAN architecture. An input SAR image is combined with a "corrupted" Multispectral (MS) image (with semi-transparent cloud cover). Therefore, this method does not work on imagery with complete cloud cover impermeable to visible light. This method is typically used for de-hazing of MS images with some but not complete cloud cover and requires a combination of SAR and MS images in order to generate an output. However, the method is able to retrieve the full spectrum of Sentinel-2 imagery.
'The SEN1-2 Dataset For Deep Learning In SAR-Optical Data Fusion' presents a GAN-based technique for colourizing and producing artificial optical imagery from SAR data. This method uses SAR - Google (RTM) Earth image pairs to train a Pix2Pix GAN. However, as the image pairs that they employ are not temporally correlated Google (RTM) Earth/SAR image pairs, this results in anomalously coloured fields. Furthermore, this method is restricted to predicting RGB imagery.
"Generating High Quality Visible Images from SAR Images Using CNNs" [https://arxiv.ora/pdf ocif] uses a GAN cascade to firstly remove "speckling" from SAR imagery, and then to colourize the resultant cleaned SAR imagery. Again, since the image pairs that they employ are not temporally correlated SAR-Google (RTM) Earth image pairs, this results in anomalously coloured fields. Also this method only predicts RGB bands, not all bands of Sentinel-2 data, which also cover the Red Edge, Near IR (NIR) and Short Wave IR (SWIR).
"Exploiting GAN-based SAR To Optical Image Transcoding For Improved Deep Learning Classification" [1-: etiplorQieg±.Loigjci2cun.len04..222] uses temporally correlated SAR/RGB-NIR image pairs to train a conditional GAN. First the SAR imagery is colourized, and then the resulting data is passed through to a semantic segmentor Convolutional Neural Network (CNN). The final result is a semantically segmented image. The authors note problems with realistic colour retrieval. There are also issues with fine-detail retrieval. Again this method does not retrieve the full spectrum, only RGB-NIR.
Summary of the Invention
According to a first aspect of the present invention there is provided a method of creating a mapping model for translating an input image to an output image, the method comprising obtaining an ensemble of training data T comprising a sample of pairs of matched images [R,V], providing a neural network and training the neural network with the training data T to obtain the mapping model V* = f(R) that translates input image R to output image V* where V* is equivalent to V in a flawless mapping.
Preferably the training data T comprises a plurality of real matched images [R,V].
Preferably the neural network comprises a generator and a discriminator.
Preferably the training comprises:
1) propagating R into the generator, wherein the generator produces V* which represents a "fake" version of V based on a transformation of R;
2) associating V* with R to form a new matched pair [R,V*];
3) propagating [R,V*] into the discriminator to determine the probability that V* is "real", wherein the probability that V* is "real" is estimated from a loss function that encodes the quantitative distance between V and V*; and
4) backpropagating the error defined by the loss function through the neural network.
Preferably there are N iterations of training steps 1 to 4 wherein T is sampled at each iteration.
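A minimal training-loop sketch of steps 1 to 4 is given below in Python (PyTorch). It assumes generator and discriminator modules, a loader yielding normalised [R, V] tensor pairs sampled from T, a discriminator that outputs a probability, and pre-built optimisers; the names and structure are illustrative assumptions rather than a definitive implementation of the invention (the 0.01 L1 weighting follows the alternative described above):

```python
import torch

def train(generator, discriminator, loader, g_opt, d_opt, n_iterations, l1_weight=0.01):
    """Iterate training steps 1-4, sampling pairs [R, V] from the training data T."""
    bce = torch.nn.BCELoss()   # discriminator is assumed to output a probability
    l1 = torch.nn.L1Loss()
    it = 0
    while it < n_iterations:
        for R, V in loader:
            # Step 1: the generator produces a "fake" V* from R.
            V_star = generator(R)
            # Step 2: associate V* with R to form the matched pair [R, V*].
            fake_pair = torch.cat([R, V_star], dim=1)
            real_pair = torch.cat([R, V], dim=1)
            # Step 3: the discriminator estimates the probability that each pair is "real".
            d_real = discriminator(real_pair)
            d_fake = discriminator(torch.cat([R, V_star.detach()], dim=1))
            # Step 4: backpropagate the discriminator error and update its weights.
            d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
            d_opt.zero_grad()
            d_loss.backward()
            d_opt.step()
            # Generator update: fool the discriminator while staying quantitatively close to V.
            g_fake = discriminator(fake_pair)
            g_loss = bce(g_fake, torch.ones_like(g_fake)) + l1_weight * l1(V_star, V)
            g_opt.zero_grad()
            g_loss.backward()
            g_opt.step()
            it += 1
            if it >= n_iterations:
                break
```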
In one alternative the loss function is learnt by the neural network.
In another alternative the loss function is hard-coded.
In a further alternative the loss function may be a combination of hard-coding and learning by the neural network. Preferably the loss function is a combination of a learnt GAN loss and the Least Absolute Deviations (L1) loss, with the L1 loss weighted at a fraction of the GAN loss; in one alternative the L1 loss is weighted at 0.01x the GAN loss.
Preferably each image in R and V is normalised.
Preferably the normalisation comprises a rescaling of the input values to floating point values in a fixed range; in one alternative the normalisation comprises a rescaling of the input values to floating point values in the range of 0 to 1.
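Such a rescaling could look like the following sketch, where the calibration limits lo and hi are illustrative placeholders (the 0-10000 example reflects the red-band calibration example given later in the description):

```python
import numpy as np

def normalise(image, lo=0.0, hi=10000.0):
    """Rescale raw band values to floating point values in the range 0 to 1,
    clipping anything outside the assumed calibration limits [lo, hi]."""
    image = np.asarray(image, dtype=np.float32)
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)
```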
Preferably the neural network comprises an encoder-decoder neural network, more preferably the neural network comprises a conditional GAN, even more preferably the neural network comprises a fully convolutional conditional GAN.
The advantage of using a conditional GAN is that the neural network not only learns the mapping from input image to output image, but also learns the loss function to train this mapping. The advantage of using a fully convolutional conditional GAN is that in addition the input images R do not all need to be of the same size and can be of different sizes to each other.
Preferably the backpropagation of the error defined by the loss function updates the weights in the neural network so that they follow the steepest descent of the loss between V and V*.
Preferably R comprises at least one SAR image, encoded as a data matrix.
Preferably V comprises at least one image in the visible-infrared spectral range, encoded as a data matrix.
Preferably the visible-infrared spectral range is between about 400-2300 nanometres (nm).
Preferably R is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
Preferably V is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
Preferably V is of size m x n at one or more frequencies across the visible-infrared spectral range.
Preferably V* is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
Preferably V* is of size m x n at one or more frequencies across the visible-infrared spectral range.
In one alternative where there are a plurality of images R they are all recorded at a single radar frequency. In another alternative where there are a plurality of images R they are recorded at multiple frequencies. For example, L-band + C-band.
In one alternative where there are a plurality of images R they are all recorded at a single polarisation. For example, vertical-vertical.
In another alternative where there are a plurality of images R they are recorded at multiple polarisations. For example, vertical-vertical and vertical-horizontal.
In another alternative where there are a plurality of images R they are recorded at different detection orientations/incident angles. For example, they could be taken by different satellites on different orbits.
Preferably R further comprises additional information representing prior knowledge about the region of interest or the observing conditions of V and/or R. Preferably the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
Preferably the additional information is selected from one or more of: a map of the surface elevation; a previously recorded unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
According to a second aspect of the present invention there is provided an imaging apparatus for creating a mapping model for translating an input image to an output image as set out in the first aspect of the present invention.
According to a third aspect of the invention there is provided a method of translating an input image R to an output image V*, the method comprising obtaining a mapping model for translating an input image to an output image as set out in the first aspect of the present invention inputting a new image R into the mapping model wherein the mapping model translates input image R and outputs image V*.
Preferably the input R comprises at least one SAR image, encoded as a data matrix.
Preferably the output V* comprises at least one image in the visible-infrared spectral range, encoded as a data matrix.
Preferably the visible-infrared spectral range is between about 400-2300 nanometres (nm).
Preferably the input image R is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
Preferably the output image V* is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
Preferably the output image V* is of size m x n at one or more frequencies across the visible-infrared spectral range.
In one alternative where there are a plurality of input images R they are all recorded at a single radar frequency. In another alternative where there are a plurality of input images R they are recorded at multiple frequencies. For example, L-band + C-band.
In one alternative where there are a plurality of input images R they are all recorded at a single polarisation. For example, vertical-vertical.
In another alternative where there are a plurality of input images R they are recorded at multiple polarisations. For example, vertical-vertical and vertical-horizontal.
In another alternative where there are a plurality of input images R they are recorded at different detection orientations/incident angles. For example, they could be taken by different satellites on different orbits with different view angles.
Preferably R further comprises additional information representing prior knowledge about the region of interest or the observing conditions.
Preferably the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
Preferably the additional information is selected from one or more of: a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
According to a fourth aspect of the present invention there is provided a method of predicting the visible-infrared band images of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region.
Preferably the method is used to predict images of the Earth's surface in the visible-infrared bands when the view between an imaging instrument (e.g. a camera) and the ground is obscured by cloud or some other medium that is opaque to EM radiation in the visible-infrared spectral range, approximately spanning 400-2300 nanometres (nm), but transparent to EM radiation in the radio-/microwave part of the spectrum.
Preferably the method used is set out in the third aspect of the present invention.
According to a fifth aspect of the present invention there is provided a method of producing a predicted visible-infrared band image of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region.
Preferably the method is used to produce a predicted image of the Earth's surface in the visible-infrared bands when the view between an imaging instrument (e.g. a camera) and the ground is obscured by cloud or some other medium that is opaque to EM radiation in the visible-infrared spectral range, approximately spanning 400-2300 nanometres (nm), but transparent to EM radiation in the radio-/microwave part of the spectrum.
Preferably the method used is set out in the third aspect of the present invention.
According to a sixth aspect of the present invention there is provided an imaging apparatus for translating an input image R to an output image V* as set out in the third aspect of the present invention.
According to a seventh aspect of the present invention there is provided an imaging apparatus for translating an input image R to an output image V* as set out in the fourth aspect of the present invention.
According to an eighth aspect of the present invention there is provided a method of generating a new set of images V+ at any frequency in the range approximately spanning 400-2300nm from V*.
Preferably the method comprises the following steps:
a) considering a pixel at coordinate (x,y) in each image in V*, wherein V* can be considered a set of images W = [V0, V1, V2, ..., VN], wherein each image corresponds to an observed bandpass at some average wavelength of EM radiation and wherein the set of wavelengths associated with each image is lambda = [lambda0, lambda1, lambda2, ..., lambdaN];
b) assuming a function S(x,y,lambda,p) represents the continuous spectral response of the Earth's surface, where p is a set of parameters; S is described by Equation 1 and p represents 6 free parameters;
c) finding p for each pixel (x,y) by fitting the function S(x,y,lambda,p) to (lambda,V*);
d) creating a new set of images V+ covering the same region as V* by applying S(x,y,lambda,p) for any given wavelength lambda.

S(lambda) = p0 [1 + exp(-p1(lambda - p2))]^(-1) + p3 exp(-p4(lambda/1500nm)) + p5 exp(-(lambda - c)^2 / (2g^2))     (Equation 1)

Preferably the emission can be calculated at a particular value of lambda to predict the emission at a particular wavelength of EM radiation.
Alternatively, the continuous spectrum S can be convolved with an arbitrary bandpass response r(lambda), which will have an effective or average wavelength.
In one alternative the Gaussian width g and centre c are variable; in another alternative g=20nm and c=560nm.
In one alternative step c) is carried out using least squares minimization.
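A per-pixel least-squares fit of this kind could be sketched as follows with scipy.optimize.curve_fit, assuming the form given as Equation 1 above and the fixed alternative g = 20 nm and c = 560 nm; the starting guesses and function names are illustrative assumptions, not values taken from the patent:

```python
import numpy as np
from scipy.optimize import curve_fit

G_WIDTH, G_CENTRE = 20.0, 560.0  # Gaussian width g and centre c in nm (fixed alternative)

def spectral_model(lam, p0, p1, p2, p3, p4, p5):
    """Continuous spectral response S(lambda, p) with six free parameters (Equation 1)."""
    sigmoid = p0 / (1.0 + np.exp(-p1 * (lam - p2)))
    decay = p3 * np.exp(-p4 * (lam / 1500.0))
    gaussian = p5 * np.exp(-((lam - G_CENTRE) ** 2) / (2.0 * G_WIDTH ** 2))
    return sigmoid + decay + gaussian

def fit_pixel(band_wavelengths, band_values, p_init=(0.3, 0.05, 700.0, 0.2, 1.0, 0.05)):
    """Step c: fit Equation 1 to the predicted band values of one pixel, returning p."""
    popt, _ = curve_fit(spectral_model, np.asarray(band_wavelengths, dtype=float),
                        np.asarray(band_values, dtype=float), p0=p_init, maxfev=10000)
    return popt

def predict_band(p, lam):
    """Step d: evaluate S at any wavelength lam (nm) to build a pixel of a new image V+."""
    return spectral_model(lam, *p)
```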
The V+ images can be stored in any format convenient for onward analysis (e.g. GeoTIFF), including any relevant georeferencing metadata.
The algorithm can reliably predict images of the Earth's surface at any wavelength across the visible to infrared spectral range (wavelengths spanning approximately 400-2300 nm) using SAR imaging.
According to a ninth aspect of the present invention there is provided a method of deriving meaningful indicators of surface conditions by producing a predicted visible-infrared band image of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region comprising the steps of a) creating a mapping model for translating an SAR image to a visible-infrared band image, the method comprising obtaining an ensemble of training data T comprising a sample of pairs of matched images [R,V], providing a neural network and training the neural network with the training data T to obtain the mapping model V* = f(R) that translates SAR image R to visible-infrared band image V* where V* is equivalent to V in a flawless mapping; and b) inputting a new SAR image R into the mapping model wherein the mapping model translates the new SAR image R to produce visible-infrared band image V*.
Preferably R further comprises additional information representing prior knowledge about the region of interest or the observing conditions of V and/or R. Preferably the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
Preferably the additional information is selected from one or more of: a map of the surface elevation; a previously recorded unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
According to a tenth aspect of the present invention there is provided a system for deriving meaningful indicators of surface conditions by producing a predicted visible-infrared band image of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region comprising: a) a mapping model for translating an SAR image to a visible-infrared band image, the method comprising obtaining an ensemble of training data T comprising a sample of pairs of matched images [R,V], providing a neural network and training the neural network with the training data T to obtain the mapping model V* = f(R) that translates SAR image R to visible-infrared band image V* where V* is equivalent to V in a flawless mapping stored on a non-transitory tangible computer readable storage medium; and b) inputting a new SAR image R into the mapping model wherein the mapping model translates the new SAR image R to produce visible-infrared band image V*.
Preferably R further comprises additional information representing prior knowledge about the region of interest or the observing conditions of V and/or R. Preferably the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
Preferably the additional information is selected from one or more of: a map of the surface elevation; a previously recorded unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
According to an eleventh aspect of the present invention there is provided a non-transitory tangible computer readable storage medium having stored thereon a computer program for implementing a method of producing a predicted visible-infrared band image of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region comprising the steps of a) creating a mapping model for translating an SAR image to a visible-infrared band image, the method comprising obtaining an ensemble of training data T comprising a sample of pairs of matched images [R,V], providing a neural network and training the neural network with the training data T to obtain the mapping model V* = f(R) that translates SAR image R to visible-infrared band image V* where V* is equivalent to V in a flawless mapping; and b) inputting a new SAR image R into the mapping model wherein the mapping model translates the new SAR image R to produce visible-infrared band image V*. Preferably R further comprises additional information representing prior knowledge about the region of interest or the observing conditions of V and/or R. Preferably the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
Preferably the additional information is selected from one or more of: a map of the surface elevation; a previously recorded unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
According to a twelfth aspect of the present invention there is provided a non-transitory tangible computer readable storage medium having stored thereon a computer program for implementing a method of deriving meaningful indicators of surface conditions by producing a predicted visible-infrared band image of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region comprising the steps of a) creating a mapping model for translating an SAR image to a visible-infrared band image, the method comprising obtaining an ensemble of training data T comprising a sample of pairs of matched images [R,V], providing a neural network and training the neural network with the training data T to obtain the mapping model V* = f(R) that translates SAR image R to visible-infrared band image V* where V* is equivalent to V in a flawless mapping; and b) inputting a new SAR image R into the mapping model wherein the mapping model translates the new SAR image R to produce visible-infrared band image V*.
Preferably R further comprises additional information representing prior knowledge about the region of interest or the observing conditions of V and/or R. Preferably the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
Preferably the additional information is selected from one or more of: a map of the surface elevation; a previously recorded unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
This Invention presents a general method to produce predicted visible-infrared images observed by a given EO satellite (or other reconnaissance platform that might be affected by intervening obscuration) using SAR imaging alone. This allows for a wide range of remote sensing predictions to be made using the full visible-infrared spectral range even in the presence of cloud cover or some other obscuring medium, provided the medium is transparent to EM radiation in the radio-/microwave part of the spectrum.
Rather than 'colourizing' a SAR image (i.e. assigning spectral information to pixels in a monochromatic image), the presented method produces images by fully predicting the visible-infrared spectral response pixel-by-pixel, thus is not biased to the resolution of the input SAR imagery or its ability to capture fine surface detail.
Advantageously this method can produce predicted visible-infrared spectral response images using SAR imagery alone, offering predictive power where obscuration is 100%. Other approaches combine SAR and visible imaging to 'dehaze' images, e.g. when cloud cover is not 100% opaque to visible and infrared photons.
Advantageously this method takes as input either one SAR input for a given target region (e.g. an image at a single frequency, polarization and orientation) or multiple inputs (e.g. multiple images at various frequencies, polarizations and orientations).
Advantageously this method can incorporate additional 'prior' information about a particular geographic region (for example, surface elevation mapping, sun angle information at the time of observation, and/or previous unobscured views). This improves the predictive power of the algorithm.
Advantageously this method can retrieve all spectral information present in the training set, spanning a wide range of frequencies from the visible to infrared. It is also possible to generate 'synthetic' images at any intermediate frequency not necessarily represented by the training set. This makes it possible to generate/predict output images that would be observed by a different instrument/filter.
Brief Description of the Drawings
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.
Figure 1 illustrates a schematic view of the general solution algorithm of the present invention; Figure 2 illustrates a (top) preferred generator, (middle) preferred injection, and (bottom) preferred discriminator subroutines; and Figure 3 illustrates (top) preferred downsampling residual block, (bottom) preferred upsampling residual block.
Detailed Description of Preferred Embodiments
The invention is an algorithmic (neural network) pipeline that takes as input one or more SAR data matrices (images) each of size m x n of a patch of the Earth's surface spanning a physical region p x q. The algorithm predicts the corresponding image(s) of size m x n at one or more frequencies across the visible-infrared bands.
We call the ensemble of input images of a given region R. We call the ensemble of output images of a given region V*. The input image(s) R could be at a single radar frequency, or multiple frequencies (e.g. L-band + C-band), and at a single polarization (e.g. vertical-vertical) or multiple polarizations (e.g. vertical-vertical and vertical-horizontal). They could also be obtained at different detection orientations/incident angles (e.g. obtained by different satellites on different orbits). Optionally, R can be supplemented by additional information representing prior knowledge about the region of interest (e.g. a map of the surface elevation and/or sun angle information at the time of observation and/or a previously measured unobscured view in each band in V*).
Given a pair [R,V], where V represents the direct unobscured view of a particular region in the visible and infrared bands, it is assumed there exists a mapping V = f(R) that translates R to V. The algorithm determines f through a training process. After training, the algorithm can use f to translate new inputs R to outputs V*. These outputs V* represent the prediction of the unobscured view V across the visible and infrared bands given only the information in R, where V = V* represents a flawless mapping.
It is expected that the training and input data will be suitably calibrated / normalised, for example the SAR data will represent detected radio / microwave backscatter reflectance and the visible-infrared data will represent Top Of Atmosphere Reflectance or Bottom of Atmosphere Reflectance values. However, in principle the exact calibration of the data is arbitrary, provided it is consistent across training data and new inputs.
To find the mapping function f, the algorithm attempts to minimise the difference between V and f(R). Training involves:
1. Assembling pairs of image ensembles [R,V] where V contains images free from cloud cover or other obscuration. R and V could be sourced from different imaging platforms (e.g. different satellites) but are matched in terms of area coverage such that each [R,V] pair covers the same physical area p x q. They need not be at identical spatial resolution. Generally each pair in the training set will cover a different geographic region, but pairs could also overlap in coverage.
2. Incrementally adjusting the mapping function N times, reducing the difference between V and f(R) slightly with each increment. Each adjustment changes some weights in f, which moves the output f(R) closer to the ground truth V. This incremental adjustment continues such that the quantitative difference (for example, as defined by the pixel-wise sum of the absolute difference) between V and f(R) is minimized.
Optionally, to improve the quality of the predicted images, a filtered representation of the unobscured image in each of the visible-infrared bands represented by V can be injected during training. This image represents prior knowledge about the region of interest, for example the last-measured unobscured image of that region, or a median-averaged masked stack of all (or a sub-set of) images of a given region to date or within recent time. These optional injected images are spatially filtered to remove low spatial frequency information, leaving high spatial frequency information that encodes fine detail about the scene. One filtering approach is to apply an unsharp mask, whereby the image is convolved with a Gaussian kernel and the convolved (smoothed) image subtracted from the original. However, other filtering techniques are possible.
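One way such an injected prior could be built is sketched below, assuming an archive array of previously observed unobscured images of the region and using a Gaussian unsharp mask as the spatial filter; the parameter choices (number of views, kernel width) are illustrative assumptions only:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def median_prior(archive, last_n=15):
    """Median-average the most recent unobscured views of a region (one band),
    given `archive` of shape (n_views, m, n) ordered oldest to newest."""
    return np.median(archive[-last_n:], axis=0)

def high_pass(prior_image, sigma=5.0):
    """Unsharp mask: convolve with a Gaussian kernel and subtract the smoothed
    image from the original, keeping only high spatial frequency detail."""
    prior_image = prior_image.astype(np.float32)
    return prior_image - gaussian_filter(prior_image, sigma=sigma)
```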
After training, the model describing the mapping V = f(R) is stored on disk or other computer readable medium as a data object, and fully describes the mapping function as a transformation matrix. The model can be loaded into memory and a new input R can then be presented to the function, which will apply f(R) to produce new outputs V*.
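Loading a stored model and applying it to a new input could look like the following sketch, assuming the mapping was trained and saved as a PyTorch state dictionary; the file path and generator object are placeholders, not details taken from the patent:

```python
import torch

def load_and_predict(model_path, generator, R):
    """Load stored weights for the mapping f and translate a new input ensemble R to V*."""
    generator.load_state_dict(torch.load(model_path, map_location="cpu"))
    generator.eval()
    with torch.no_grad():
        return generator(R)   # V*: the predicted visible-infrared images for this R
```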
If the training data samples a range of wavelengths over the visible-to-infrared spectral range (e.g. as is the case for multi-band Sentinel-2 imagery), it is possible to derive images at any arbitrary wavelength across the range approximately spanning 400-2300nm using an interpolation function. This makes it possible to predict imagery that would be obtained by an arbitrary detector (e.g. another satellite platform) in the visible-infrared range.
General Solution
A schematic view of the general solution algorithm is shown in Figure 1. The following describes the main steps.
Consider an ensemble of training data T comprising a set of 'real' [R,V] pairs. Each [R,V] pair represents imagery of a particular geographic region. Preferably the data in R and V in a given pair would be observed at the same time, but realistically they will be observed at slightly different times. Ideally each [R,V] pair will be assembled such that the SAR images and visible-infrared images are taken as close together in time as is feasible. Importantly, R can also include non-SAR information, such as a digital elevation model or measurement, sun elevation angle or time of year. Other information could also be included. Each image in V is an unobscured (e.g. zero cloud cover, low cirrus) image of the region. The calibration of each data component is in principle arbitrary, but should be consistent across the training data. For example, all input images in V representing the red band could be normalised to values of 0-10000. The same calibration should apply to all red images processed by the algorithm.
Training involves a series of N iterations whereby T is sampled at each iteration. The sampling can pick a single pair [R,V] from T or a 'batch' of M pairs of [R,V] from T. At each iteration the algorithm proceeds as follows, processing either a single pair [R,V] or a batch of pairs of [R,V]:
1. Each data matrix in R is normalised and passed to the 'generator'. The purpose of the generator is to produce a generated (or 'fake') set of data based on a transformation of R:
a. R propagates through a neural network. This network consists of one or more 'layers' of artificial neurons. An artificial neuron is effectively a mathematical function that transforms an input to an output by multiplication by a weight and application of an 'activation function'. The layers of neurons act like a transformation matrix on the input, R. Each neuron layer initially comprises a set of randomised weights, which multiply subsections of the incoming matrix R. This step produces an output V* which represents a 'fake' or generated version of V based on R.
b. Optionally, R can contain channels with additional known prior information. For example, a channel could contain a surface elevation map. An image with all pixels set to the same value could be used to encode information shared by the entire image, for example the average sun elevation angle, or the number of days since January 1st of that year at the time of observation.
c. Optionally, images representing an estimate of the unobscured surface (e.g. a cloud-free median stack from archival visible-infrared data) in each band represented by V are filtered and injected into the network. One purpose of the filtering could be to remove colour information and low spatial frequency information from the data. The filtered images F are summed with the corresponding image in V*.
d. Optionally, a final set of neuron layer(s) is applied to blend F and V* to produce the generator output.
2. The generator output(s) V* are concatenated with the corresponding input(s) R to form a new pair [R,V*]. This is the 'fake' or generated data. This data is passed to the 'discriminator'.
3. The discriminator estimates the probability that V* is an example from the real set of data T. The probability is estimated from a loss function that encodes the quantitative distance between V and V*. The loss function itself could be learnt by the neural network, or could be hard-coded, or could be a combination of the two. For example, a possible loss function could be the sum of the squared differences between pixels in V and V*, but other loss functions are feasible.
4. Backpropagation is used to update the network weights so that they follow the steepest descent of the loss (or prediction 'error') between V and V*. Backpropagation is the technique of updating the network weights back through layers of a neural network through gradient descent, where the gradient refers to the gradient of the loss function and is normally evaluated through auto-differentiation. By descending down the gradient, the algorithm seeks the minimum of the loss function. Due to the architecture of a neural network, the weights in successive layers can be updated following the chain rule. After updating, the performance of the discriminator in classifying a given image as being sampled from the real data T or being an output of the generator is improved.
5. The weights of the generator are updated to increase the probability that the generator output is misclassified by the discriminator as being from the real data T.
6. [R,V] is passed to the discriminator. The discriminator estimates the probability that [R,V] is sampled from the generated set of [R,V*] through the same loss function as (3). The loss, or prediction 'error', is backpropagated through the network, and the weights of the discriminator are updated to follow the steepest descent of the loss function.
7. The weights of the discriminator are updated to improve the probability (i.e. reduce the loss) that it correctly classifies [R,V] as being sampled from the real data T and [R,V*] as being a product of the generator.
8. The outputs of the generator can be retrieved at any point during or after training.
After training is complete the network can be stored on disk or other computer readable medium as a transformation matrix encoding all weight information. This is referred to as the 'model'. The model can now accept a new input R and generate an associated output V* through step 1. V* represents the prediction for the unseen (e.g. obscured) V that corresponds to R. The images comprising V* are de-normalised to produce images calibrated in an identical manner to the training set. The images can be stored in any format convenient for onward analysis (e.g. GeoTIFF), including any relevant georeferencing metadata.
Preferred Solution
Assuming training data T is assembled into an ensemble of [R,V] pairs as described above, where each R in a preferred example comprises:
1. A C-band (approx. 5.4GHz) VH (vertical transmit-horizontal receive) cross-polarised SAR image observed on an ascending polar orbit, spanning a physical region 10.24x10.24km with a pixel scale of 10m/pix. Preferably the SAR data is projected to ground range using an Earth ellipsoid model.
Preferably the SAR image is corrected for thermal noise. Preferably radiometric calibration is applied. Preferably orthorectification is applied.
2. A C-band (approx. 5.4GHz) VH (vertical transmit-horizontal receive) cross-polarised SAR image observed on a descending polar orbit, spanning a physical region 10.24x10.24km with a pixel scale of 10m/pix. Preferably the SAR data is projected to ground range using an Earth ellipsoid model.
Preferably the SAR image is corrected for thermal noise. Preferably radiometric calibration is applied. Preferably orthorectification is applied.
3. A C-band (approx. 5.4GHz) VV (vertical transmit-vertical receive) like-polarised SAR image observed on an ascending polar orbit, spanning a physical region 10.24x10.24km with a pixel scale of 10m/pix. Preferably the SAR data is projected to ground range using an Earth ellipsoid model. Preferably the SAR image is corrected for thermal noise. Preferably radiometric calibration is applied. Preferably orthorectification is applied.
4. A C-band (approx. 5.4GHz) VV (vertical transmit-vertical receive) like-polarised SAR image observed on a descending polar orbit, spanning a physical region 10.24x10.24km with a pixel scale of 10m/pix. Preferably the SAR data is projected to ground range using an Earth ellipsoid model. Preferably the SAR image is corrected for thermal noise. Preferably radiometric calibration is applied. Preferably orthorectification is applied.
5. An image encoding the surface elevation interpolated onto the same pixel grid as the SAR images 1-4.
6. An image with an identical pixel grid to images 1-5 with pixels set to a single value representing the average time of observations 1-4, defined as the number of days since January 1st divided by 365.
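Item 6 (and the similar constant-value channels mentioned elsewhere, such as average sun elevation angle) could be constructed as in the following minimal sketch; the names, and the 1024x1024 grid implied by 10.24 km at 10 m/pix, are illustrative assumptions:

```python
from datetime import date
import numpy as np

def day_of_year_fraction(d: date) -> float:
    """Number of days since January 1st of that year, divided by 365."""
    return (d - date(d.year, 1, 1)).days / 365.0

def scalar_channel(value: float, shape=(1024, 1024)) -> np.ndarray:
    """An image on the same pixel grid with every pixel set to a single scalar value."""
    return np.full(shape, value, dtype=np.float32)

# example: encode an observation date as a constant-valued channel
time_channel = scalar_channel(day_of_year_fraction(date(2020, 6, 1)))
```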
The training set T is assembled in a preferred example as follows:
1. Identify a cloud-free and preferably low cirrus image spanning 10.24x10.24km of the Earth's surface in one or more bands across the visible-infrared spectral range, e.g. (R, G, B, NIR), with a pixel scale of 10m/pix. Preferably images in these bands will be recorded at the same time, t, for a given region. Preferably each band will represent Bottom Of Atmosphere reflectance values.
2. For the V defined in 1, construct the corresponding R as above, where each SAR image 1-4 represents the median of all corresponding SAR images for the same physical region covered by V recorded within plus or minus 3 days of t.
3. Repeat 1 & 2 to assemble a large number (preferably 1000s) of [R,V] pairs to form T. Preferably these will include different geographic regions, and for a given region, multiple observations recorded at different times of year.
Having assembled T, the preferred training algorithm is described as follows:
1. Each image in R is normalised and cropped to a random area. The crop is then passed to the 'generator'. The preferred generator is an 'encoder-decoder' neural network. The purpose of the generator is to produce a generated (or 'fake') version of V based on a transformation of R:
a. R propagates through the generator, as shown in Figure 2. The generator consists of five downsampling residual blocks, as shown in Figure 3. A 'bottleneck' is then applied, consisting of three convolutional layers. A convolutional layer is a collection of learnt filters that comprise an n by n set of neurons. The filters convolve over the layer input, producing a feature map. This feature map is essentially a measure of the number of detected learnt features in the input matrix.
The initial bottleneck layer downsamples the incoming matrices by a factor of two along the spatial dimensions. A final bottleneck layer deconvolves, or upsamples, the matrices by a factor of two. Five upsampling residual blocks (Figure 3) are then applied. Each downsampling or upsampling block downsamples or upsamples the spatial image axes by a factor of two while simultaneously increasing or decreasing the number of filters (feature maps) by a factor of two. The preferred method to downsample by a factor of n is to apply a convolutional layer with a pixel stride of n. This increases the spatial distance between convolutional filter samples by a factor of n, reducing the feature map size by a factor of n. Other downsampling methods could also be used. The preferred method to upsample by a factor of n is to apply a transpose convolution (deconvolution) layer with a pixel stride of n. This is equivalent to convolving over an image with a padding of zero valued pixels around each pixel. Other upsampling methods could also be used. Rectified Linear Unit (ReLU) activation is applied after each convolutional layer. A ReLU is a mathematical function that returns the positive part of its argument: f(x) = max(0,x).

b. Ideally (but optionally) R can contain channels in addition to the SAR image channels. These additional channels encode other known information. For example, a surface elevation map could be included.
Information encoded by a single variable (e.g. days elapsed in the year at the time of observation, or average sun elevation angle) could be included by appending a channel map with all pixels set to the value of the single variable.
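As an illustration of the downsampling/upsampling residual blocks described in step 1a above, the sketch below uses a PyTorch-style layer; the kernel sizes, the 1x1 shortcut and the class name are assumptions, since the description fixes only the stride-2 convolution/transpose-convolution resampling, the halving/doubling of feature maps and the ReLU activations:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block that changes spatial resolution by a factor of two.

    mode='down' uses a strided convolution (stride 2) to halve the spatial
    axes while the number of filters is doubled; mode='up' uses a transpose
    convolution (stride 2) to double the spatial axes while the number of
    filters is halved, as described in step 1a.
    """
    def __init__(self, in_ch, out_ch, mode='down'):
        super().__init__()
        if mode == 'down':
            self.resample = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1)
        else:
            self.resample = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
        self.conv = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)
        self.skip = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # 1x1 shortcut for the residual path
        self.pool = nn.AvgPool2d(2) if mode == 'down' else nn.Upsample(scale_factor=2)
        self.act = nn.ReLU()  # f(x) = max(0, x), applied after each convolutional layer

    def forward(self, x):
        h = self.act(self.resample(x))
        h = self.act(self.conv(h))
        return h + self.pool(self.skip(x))  # residual (skip) connection
```

Five such blocks in 'down' mode followed by the bottleneck and five blocks in 'up' mode would reproduce the encoder-decoder shape of Figure 2, with channel counts chosen to double on the way down and halve on the way up.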
c. Ideally (but optionally), for each region under consideration (in training and in prediction) we construct a corresponding 'rolling prior' image in each of the bands comprising V using archival imaging. Every pixel in each rolling prior image is evaluated as the median value of the last P unobscured (e.g. cloud-free) views of the region subtended by that pixel in each band. We adopt P=15, but this parameter could be varied.
The objective is to construct a statistically robust unobscured image that encodes up-to-date knowledge of the surface detail of a given geographic region, observed prior to the current observation (which might be obscured). The rolling prior image is filtered by subtracting a Gaussian-convolved version of the matrix from the original matrix to produce F. The Gaussian kernel width can be varied appropriately taking into account the spatial resolution of the images comprising V. These filtered images F are summed with the corresponding image in V*.
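A minimal sketch of the rolling prior and its high-pass filtering described in sub-item c, assuming an archive array of past unobscured views; the Gaussian width used here is illustrative only:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def rolling_prior(archive, P=15):
    """Per-pixel median of the last P unobscured (cloud-free) views of one band.

    `archive` is assumed to be an array of shape (N, H, W) holding the N most
    recent unobscured images of the region in that band, newest last.
    """
    return np.median(archive[-P:], axis=0)

def high_pass_filter(prior, sigma=5.0):
    """Subtract a Gaussian-convolved copy from the rolling prior to produce F.

    `sigma` (in pixels) is illustrative; the description only requires the
    kernel width to suit the spatial resolution of the images comprising V.
    """
    return prior - gaussian_filter(prior, sigma=sigma)
```

The resulting F would then be summed with the corresponding generator output channel, as described above.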
d. Ideally (but optionally), a residual block and a convolutional layer with a sigmoid activation (Figure 2) are applied to blend F and V* and produce the generator output.
2. The generator output(s) V* are concatenated along the channel axis with the corresponding input(s) R to form a new pair (or pairs, in the case of batch sampling) [R,V*]. This is the 'fake' or generated data. This data is passed to the 'discriminator'.
3. The discriminator estimates the probability that V* is an example from the real set of data T. The preferred discriminator architecture is described in Figure 2 and comprises four layer sets. Each layer set begins with a downsample convolution, followed by two convolution layers. All convolution layers are followed by a ReLU activation. After these layer sets, global average pooling is applied, and the result is densely connected to a binary sigmoid activation output. After propagation of V* through the discriminator, the probability is estimated from a loss function that encodes the quantitative distance between V and V*. The loss function is preferably a combination of a learnt GAN loss, and the Least Absolute Deviations (L1) loss, with the L1 loss weighted at 0.01x the GAN loss.
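A minimal sketch of the combined GAN + L1 loss of step 3 from the generator's point of view, assuming a PyTorch-style implementation; the helper name and the use of binary cross-entropy as the learnt GAN term are assumptions, while the 0.01x weighting of the L1 term follows the description above:

```python
import torch
import torch.nn.functional as F

def generator_loss(disc_fake_prob, v_fake, v_real, l1_weight=0.01):
    """Combined loss for the generator: learnt GAN loss plus L1 loss,
    with the L1 term weighted at 0.01x the GAN term.

    `disc_fake_prob` is the discriminator's sigmoid output for [R, V*];
    the generator is rewarded when this output is classified as 'real'.
    """
    gan_loss = F.binary_cross_entropy(disc_fake_prob, torch.ones_like(disc_fake_prob))
    l1_loss = F.l1_loss(v_fake, v_real)  # Least Absolute Deviations between V* and V
    return gan_loss + l1_weight * l1_loss
```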
4. Backpropagation is used to update the network weights so that they follow the steepest descent of the loss (or prediction 'error') between V and V*. After updating the weights, the discriminator can better classify whether a given image is sampled from the real data T or is an output of the generator.
5. The weights of the generator are updated to increase the probability that the generator output is misclassified by the discriminator as originating from the real data T.

6. [R,V] is passed to the discriminator. The discriminator estimates the probability that [R,V] is sampled from the generated set of [R,V*] through the GAN+L1 combination loss function. The loss, or prediction 'error' is backpropagated through the network, and the weights of the discriminator are updated to follow the steepest descent of the loss function.
7. The weights of the discriminator are updated to improve the probability (i.e. reduce the loss) that it correctly classifies [R,V] as being sampled from the real data T and [R,V*] as being a product of the generator.
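A minimal sketch of one training iteration covering steps 2-7, reusing the generator_loss helper sketched above; the optimiser handling and batching are assumptions rather than details fixed by the description:

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt, R, V):
    """One adversarial iteration over a batch [R, V] (steps 2-7 above)."""
    # Steps 2-5: generate V*, pair it with R and update the generator so that
    # the discriminator is more likely to misclassify [R, V*] as real.
    V_fake = generator(R)
    fake_pair = torch.cat([R, V_fake], dim=1)            # concatenate along the channel axis
    g_loss = generator_loss(discriminator(fake_pair), V_fake, V)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    # Steps 6-7: update the discriminator to classify [R, V] as real
    # and [R, V*] as a product of the generator.
    real_pair = torch.cat([R, V], dim=1)
    fake_pair = torch.cat([R, V_fake.detach()], dim=1)   # detach: generator weights are not updated here
    p_real = discriminator(real_pair)
    p_fake = discriminator(fake_pair)
    d_loss = (F.binary_cross_entropy(p_real, torch.ones_like(p_real))
              + F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()
    return g_loss.item(), d_loss.item()
```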
8. The outputs of the generator can be retrieved at any point during or after training.
After training is complete the network can be stored including on a computer readable medium such as a disk as a transformation matrix encoding all weight information.
This is referred to as the 'model'. The model can now accept a new input R and generate an associated output V* image through step 1. V* represents the prediction for the unseen (e.g. obscured) V that corresponds to R. The outputs are de-normalised to produce images calibrated in an identical manner to the training set.
The images can be stored in any format convenient for onward analysis (e.g. GeoTIFF) including any relevant georeferencing metadata.
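A minimal sketch of de-normalising and storing a prediction as a GeoTIFF with georeferencing metadata, assuming the rasterio library and per-band scale factors recorded at training time; all names here are illustrative:

```python
import numpy as np
import rasterio

def save_prediction(v_star, band_scale, transform, crs, path="prediction.tif"):
    """De-normalise the model output V* and store it as a GeoTIFF.

    `band_scale` holds the per-band factors used when normalising the
    training set; `transform` and `crs` carry the georeferencing of the
    input SAR tile so the output remains geolocated.
    """
    bands = v_star * band_scale[:, None, None]   # invert the training normalisation
    with rasterio.open(path, "w", driver="GTiff",
                       height=bands.shape[1], width=bands.shape[2],
                       count=bands.shape[0], dtype="float32",
                       crs=crs, transform=transform) as dst:
        dst.write(bands.astype(np.float32))
```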
The images represented by the prediction V* will be functionally equivalent to the V images in T - i.e. the same set of observed bands. Each band is characterised by a bandpass centred at a given wavelength in the visible-infrared spectral range. These output images could be analysed 'as is'. Optionally, using the output images V* it is possible to generate a new set of images V+ at any frequency in the range approximately spanning 400-2300nm using an analytic interpolation function:

a) Consider a pixel at coordinate (x,y) in each image in V*.

b) V* can be considered a set of images V* = [V0, V1, V2, ..., VN]. Each image corresponds to an observed wavelength (or more generally, the response in a given bandpass, encoded by a function r(lambda), where lambda is the wavelength of EM radiation, resulting in an effective or average wavelength for the band). Call the set of wavelengths associated with each image lambda = [lambda0, lambda1, lambda2, ..., lambdaN].
c) Assume a function S(x,y,lambda,p) represents the continuous spectral response of the Earth's surface, where p are a set of parameters. S is described by Equation 1. p represents 6 free parameters. The Gaussian width g and centre c could be variable, but g=20nm and c=560nm can be fixed to provide adequate fits to archival data.
d) Find p for each pixel (x,y) by fitting the function S(x,y,lambda,p) to (lambda, V*) -e.g. through least squares minimization.
e) Having determined p(x,y), create a new set of images V+ covering the same region as V* by applying S(x,y,lambda,p) for any given wavelength lambda. Alternatively, convolve the continuous spectrum S with arbitrary bandpass response r(lambda).
S(lambda) = [p0 (1 + exp(-p1 (lambda - p2)))^-1 + p3] x exp(-p4 (lambda / 1500nm)) + p5 exp(-(lambda - c)^2 / (2 g^2))    Equation 1

The V+ images can be stored in any format convenient for onward analysis (e.g. GeoTIFF), including any relevant georeferencing metadata.
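A minimal sketch of steps c)-e), assuming the reconstructed form of Equation 1 above and SciPy's least-squares fitting; the function and variable names (S, fit_pixel, band_wavelengths, p_init) are illustrative rather than taken from the description:

```python
import numpy as np
from scipy.optimize import curve_fit

C, G = 560.0, 20.0  # fixed Gaussian centre c (nm) and width g (nm)

def S(lam, p0, p1, p2, p3, p4, p5):
    """Continuous spectral response model of Equation 1 (lam in nm)."""
    return ((p0 / (1.0 + np.exp(-p1 * (lam - p2))) + p3)
            * np.exp(-p4 * (lam / 1500.0))
            + p5 * np.exp(-(lam - C) ** 2 / (2.0 * G ** 2)))

def fit_pixel(band_wavelengths, pixel_values, p_init):
    """Fit the 6 free parameters p for one pixel by least squares (step d)."""
    p, _ = curve_fit(S, band_wavelengths, pixel_values, p0=p_init, maxfev=5000)
    return p

# Step e: evaluate S at any wavelength in roughly 400-2300 nm to build V+,
# e.g. new_value = S(842.0, *p) for a synthetic band centred at 842 nm.
```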
The algorithm can reliably predict images of the Earth's surface at any frequency/wavelength across the visible to infrared spectral range (wavelengths spanning approximately 400-2300 nm) using SAR imaging.
The level of confidence of the predicted images (e.g. the 68% confidence interval of a given pixel in a given band) can be estimated from the training data through a ground truth / prediction validation exercise.
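A minimal sketch of one way to carry out such a validation exercise, assuming held-out ground-truth/prediction pairs are available for the band of interest; the helper name is illustrative and the 16th/84th percentiles correspond to the 68% interval mentioned above:

```python
import numpy as np

def confidence_interval(predicted, observed, lower=16.0, upper=84.0):
    """Estimate the 68% confidence interval of the prediction error for one
    band from a held-out validation set of ground-truth/prediction pairs.

    `predicted` and `observed` are arrays of matching shape; the returned
    residual percentiles can be attached to future predictions in that band
    as an approximate per-pixel uncertainty.
    """
    residuals = (predicted - observed).ravel()
    return np.percentile(residuals, [lower, upper])
```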
The performance of the algorithm can be improved by including prior information about the region of interest and observation, for example, surface elevation data, sun angle information, date and time of observation, or previously observed surface detail in the bands of interest.
The algorithm can be used to 'in-fill' regions of visible-infrared band images affected by cloud or cloud shadow (or other obscuration or corruption or missing data), or to generate entire images of a given region if the obscuration is complete (e.g. 100% cloud cover).
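A minimal sketch of the in-fill step, assuming a per-pixel cloud/shadow mask is available for the observed optical image (for example from that product's own quality layer); names are illustrative:

```python
import numpy as np

def infill_clouds(observed, predicted, cloud_mask):
    """Replace cloud/shadow-affected pixels of an observed visible-infrared
    image with the corresponding pixels of the prediction V*.

    `observed` and `predicted` have shape (bands, H, W); `cloud_mask` is a
    boolean (H, W) array that is True where the observed image is obscured.
    With 100% cloud cover the result is simply the prediction.
    """
    mask = np.broadcast_to(cloud_mask, observed.shape)
    return np.where(mask, predicted, observed)
```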
The output images can be analysed in the same manner as images directly observed in the visible-infrared bands.
The output images can be used individually (e.g. single-band) or in combination (multi-band) to derive meaningful indicators of surface conditions. These could be related to the presence or absence of water, soil properties, signatures of drought, signatures of over-grazing by cattle, or the presence, density and health of vegetation. Some examples of analysis products derived from the predicted visible-infrared imaging related to agricultural monitoring include, but are not limited to:

o Normalized Difference Vegetation Index
o Enhanced Vegetation Index
o Normalized Difference Water Index
o Soil-adjusted Vegetation Index

The full spectral response across the visible-infrared bands can be used in many ways to determine surface conditions and properties, and the present Invention allows these techniques to be used. For example, the yield of a particular crop could be estimated by a function of the full visible-infrared spectral response. The spectral response can be mapped to physical parameters (e.g. biomass) through ground truth validation.
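As an illustration, two of the indices listed above can be computed directly from the predicted bands; the band names here are assumptions about how V* (or V+) is organised rather than anything fixed by the description:

```python
import numpy as np

def ndvi(red, nir):
    """Normalized Difference Vegetation Index from predicted red and NIR bands."""
    return (nir - red) / (nir + red + 1e-6)   # small epsilon avoids division by zero

def ndwi(green, nir):
    """Normalized Difference Water Index (McFeeters form) from predicted green and NIR bands."""
    return (green - nir) / (green + nir + 1e-6)
```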
Output images obtained at different times can reveal changes in land use or surface properties, including but not limited to:

o Roughing or cultivation or change of use of a field
o Forestation/deforestation
o Harvesting of crops
o Onset (recovery) of (from) drought
o Flooding
o Mining activity
o Coastal erosion
o The growth stage of vegetation/crops
o Under or overgrazing by cattle
o Construction or destruction of buildings or changes to urban infrastructure

The output images could be used to:

o Predict the expected yield of a crop or the dry matter content, e.g. measured in kilograms per hectare
o Predict the optimal time for harvesting based on the growth stage of a particular crop
o Classify land-use (urban, agriculture, forest, etc.)
o Identify signatures of crop stress
o Classify soil type
o Estimate the fraction of a patch of land that is bare (e.g. from overgrazing by cattle)
o Estimate surface texture (e.g. different tillage)
o Identify signatures of pest infestation or disease in crops and vegetation
o Identify anomalies in the spectral response of a region relative to the surrounding area or regions with similar properties (e.g. fields of rapeseed) within an image taken at a single epoch, or between observations taken at different epochs
o Identify regions of flooding or at risk from flooding
o Measure the surface area of water in rivers, reservoirs, lakes, and other permanent or transient bodies of water
o Identify regions affected by or under threat from wildfire
o Identify temporal trends and/or statistical anomalies in the spectral response across the visible-infrared bands either on a pixel-by-pixel basis, or averaged/aggregated over multiple pixels (e.g. a field)
o Identify natural and artificial boundaries such as hedges around a field
o Measure changes to the track of a river
o Identify and measure sites of coastal erosion
o Identify and classify changes in land use, e.g. agricultural-to-urban, or due to industrial or military activity
o Measure local and macroscopic trends related to normal and anomalous environmental conditions, for example, the emergence of leaves on deciduous trees across a country

Claims (55)

  1. 1. A method of creating a mapping model for translating an input image to an output image, the method comprising obtaining an ensemble of training data T comprising a sample of pairs of matched images [R,V], providing a neural network and training the neural network with the training data T to obtain the mapping model V* = f(R) that translates input image R to output image V* where V* is equivalent to V in a flawless mapping.
  2. 2. A method according to claim 1 wherein the training data T comprises a plurality of real matched images [R,V].
  3. 3. A method according to claim 1 or claim 2 wherein the neural network comprises a generator and a discriminator.
  4. 4. A method according to any preceding claim comprising the following steps:
1) propagating R into the generator, wherein the generator produces V* which represents a "fake" version of V based on a transformation of R;
2) associating V* with R to form new matched pair [R,V*];
3) propagating [R,V*] into the discriminator to determine the probability that V* is "real", wherein the probability that V* is "real" is estimated from a loss function that encodes the quantitative distance between V and V*;
4) backpropagating the error defined by the loss function through the neural network.
  5. 5. A method according to claim 4 wherein there are N iterations of training steps 1 to 4 wherein T is sampled at each iteration.
  6. 6. A method according to claim 4 or claim 5 wherein the loss function is learnt by the neural network.
  7. 7. A method according to claim 4 or claim 5 wherein the loss function is hard-coded.
  8. 8. A method according to claim 4 or claim 5 wherein the loss function is a combination of hard-coding and learning by the neural network.
  9. 9. A method according to claim 8 wherein the loss function is a combination of a learnt GAN loss, and a Least Absolute Deviations (L1) loss, with the L1 loss weighted at a fraction of the GAN loss.
  10. 10. A method according to any preceding claim wherein each image in R and V is normalised.
  11. 11. A method according to claim 10 wherein normalisation comprises a rescaling of the input values to floating point values in a fixed range.
  12. 12. A method according to any preceding claim wherein the neural network comprises an encoder-decoder neural network.
  13. 13. A method according to any preceding claim wherein the neural network comprises a conditional GAN.
  14. 14. A method according to any preceding claim wherein the neural network comprises a fully convolutional conditional GAN.
  15. 15. A method according to any preceding claim when dependent on claim 4 wherein the backpropagation of the error defined by the loss function updates the weights in the neural network so that they follow the steepest descent of the loss between V and V*.
  16. 16. A method according to any preceding claim wherein R comprises at least one SAR image, encoded as a data matrix.
  17. 17. A method according to any preceding claim wherein V comprises at least one image in the visible-infrared spectral range, encoded as a data matrix.
  18. 18. A method according to any preceding claim wherein the visible-infrared spectral range is between about 400-2300 nanometres (nm).
  19. 19. A method according to any preceding claim wherein R is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
  20. 20. A method according to any preceding claim wherein V is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
  21. 21. A method according to any preceding claim wherein V is of size m x n at one or more frequencies across the visible-infrared spectral range.
  22. 22. A method according to any preceding claim wherein V* is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
  23. 23. A method according to any preceding claim wherein V* is of size m x n at one or more frequencies across the visible-infrared spectral range.
  24. 24. A method according to any preceding claim wherein where there are a plurality of images R they are all recorded at a single radar frequency.
  25. 25. A method according to any of claims 1 to 23 wherein where there are a plurality of images R they are recorded at multiple frequencies.
  26. 26. A method according to any preceding claim wherein where there are a plurality of images R they are all recorded at a single polarisation.
  27. 27. A method according to any of claims 1 to 26 wherein where there are a plurality of images R they are recorded at multiple polarisations.
  28. 28. A method according to any preceding claim wherein where there are a plurality of images R they are recorded at different detection orientations/incident angles.
  29. 29. A method according to any preceding claim wherein R further comprises additional information representing prior knowledge about the region of interest or the observing conditions of V and/or R.
  30. 30. A method according to claim 29 wherein the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
  31. 31. A method according to claim 29 or claim 30 wherein the additional information is selected from one or more of: a map of the surface elevation; a previously recorded unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
  32. 32. An imaging apparatus for creating a mapping model for translating an input image to an output image using the method according to any of claims 1 to 31.
  33. 33. A method of translating an input image R to an output image V*, the method comprising obtaining a mapping model for translating an input image to an output image according to any of claims 1 to 31, inputting a new image R into the mapping model, wherein the mapping model translates input image R and outputs image V*.
  34. 34. A method according to claim 33 wherein the input R comprises at least one SAR image, encoded as a data matrix.
  35. 35. A method according to claim 33 or claim 34 wherein the output V* comprises at least one image in the visible-infrared spectral range, encoded as a data matrix.
  36. 36. A method according to any of claims 33 to 35 wherein the visible-infrared spectral range is between about 400-2300 nanometres (nm).
  37. 37. A method according to any of claims 33 to 36 wherein the input image R is of size m x n.
  38. 38. A method according to any of claims 33 to 37 wherein the input image R is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
  39. 39. A method according to any of claims 33 to 38 wherein the output image V* is of size m x n of a patch of the Earth's surface spanning a physical region p x q.
  40. 40. A method according to any of claims 33 to 39 wherein the output image V* is of size m x n at one or more frequencies across the visible-infrared spectral range.
  41. 41. A method according to any of claims 33 to 40 wherein where there are a plurality of input images R they are all recorded at a single radar frequency.
  42. 42. A method according to any of claims 33 to 40 wherein where there are a plurality of input images R they are recorded at multiple frequencies.
  43. 43. A method according to any of claims 33 to 42 wherein where there are a plurality of input images R they are all recorded at a single polarisation.
  44. 44. A method according to any of claims 33 to 42 wherein where there are a plurality of input images R they are recorded at multiple polarisations.
  45. 45. A method according to any of claims 33 to 44 wherein where there are a plurality of input images R they are recorded at different detection orientations/incident angles.
  46. 46. A method according to any of claims 33 to 40 wherein R further comprises additional information representing prior knowledge about the region of interest or the observing conditions.
  47. 47. A method according to claim 46 wherein the additional information includes but is not limited to a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
  48. 48. A method according to claim 46 or claim 47 wherein the additional information is selected from one or more of: a map of the surface elevation; a previously observed unobscured view in each visible-infrared spectral band; a map of the location of each pixel; time of year; and sun elevation/azimuth angle information.
  49. 49. A method of predicting the visible-infrared band images of a region of the Earth's surface that would be observed by an EO satellite or other high-altitude imaging platform, using data from SAR imaging of the same region.
  50. 50. A method according to claim 49 used to predict images of the Earth's surface in the visible-infrared bands when the view between an imaging instrument and the ground is obscured by cloud or some other medium that is opaque to EM radiation in the visible-infrared spectral range, spanning approximately 400-2300 nanometres (nm), but transparent to EM radiation in the radio-/microwave part of the spectrum.
  51. 51. A method according to claim 49 or claim 50 using the method according to any of claims 33 to 48.
  52. 52. An imaging apparatus for translating an input image R to an output image V* according to any of claims 33 to 48.
  53. 53. An imaging apparatus for translating an input image R to an output image V* according to any of claims 49 to 51.
  54. 54. A method of generating a new set of images V+ at any frequency in the range approximately spanning 400-2300nm from V*.
  55. 55. A method as claimed in claim 54 comprising the following steps:
a) considering a pixel at coordinate (x,y) in each image in V*, wherein V* can be considered a set of images V* = [V0, V1, V2, ..., VN] wherein each image corresponds to an observed bandpass at some average wavelength of EM radiation and wherein the set of wavelengths associated with each image is lambda = [lambda0, lambda1, lambda2, ..., lambdaN];
b) assuming a function S(x,y,lambda,p) represents the continuous spectral response of the Earth's surface, where p are a set of parameters, S is described by Equation 1 and p represents 6 free parameters;
c) finding p for each pixel (x,y) by fitting the function S(x,y,lambda,p) to (lambda,V*);
d) creating a new set of images V+ covering the same region as V* by applying S(x,y,lambda,p) for any given wavelength lambda.

S(lambda) = [p0 (1 + exp(-p1 (lambda - p2)))^-1 + p3] x exp(-p4 (lambda / 1500nm)) + p5 exp(-(lambda - c)^2 / (2 g^2))    Equation 1
GB2111302.2A 2019-08-13 2019-08-13 Method and apparatus Active GB2595122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2111302.2A GB2595122B (en) 2019-08-13 2019-08-13 Method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1911577.3A GB2586245B (en) 2019-08-13 2019-08-13 Method and apparatus
GB2111302.2A GB2595122B (en) 2019-08-13 2019-08-13 Method and apparatus

Publications (2)

Publication Number Publication Date
GB2595122A true GB2595122A (en) 2021-11-17
GB2595122B GB2595122B (en) 2022-08-24

Family

ID=67990965

Family Applications (2)

Application Number Title Priority Date Filing Date
GB1911577.3A Active GB2586245B (en) 2019-08-13 2019-08-13 Method and apparatus
GB2111302.2A Active GB2595122B (en) 2019-08-13 2019-08-13 Method and apparatus

Family Applications Before (1)

Application Number Title Priority Date Filing Date
GB1911577.3A Active GB2586245B (en) 2019-08-13 2019-08-13 Method and apparatus

Country Status (1)

Country Link
GB (2) GB2586245B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110988818B (en) * 2019-12-09 2023-03-17 西安电子科技大学 Cheating interference template generation method for countermeasure network based on condition generation formula
CN111833239B (en) * 2020-06-01 2023-08-01 北京百度网讯科技有限公司 Image translation method and device and image translation model training method and device
CN113762277B (en) * 2021-09-09 2024-05-24 东北大学 Multiband infrared image fusion method based on Cascade-GAN
CN114442092B (en) * 2021-12-31 2024-04-12 北京理工大学 SAR deep learning three-dimensional imaging method for distributed unmanned aerial vehicle
CN114581771B (en) * 2022-02-23 2023-04-25 南京信息工程大学 Method for detecting collapse building by high-resolution heterogeneous remote sensing
CN117975383B (en) * 2024-04-01 2024-06-21 湖北经济学院 Vehicle positioning and identifying method based on multi-mode image fusion technology

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636742A (en) * 2018-11-23 2019-04-16 中国人民解放军空军研究院航空兵研究所 The SAR image of network and the mode conversion method of visible images are generated based on confrontation

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636742A (en) * 2018-11-23 2019-04-16 中国人民解放军空军研究院航空兵研究所 The SAR image of network and the mode conversion method of visible images are generated based on confrontation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Enomoto, Kenji, et al. "Image translation between SAR and optical imagery with generative adversarial nets." IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2018 *
Wang, Puyang, and Vishal M. Patel. "Generating high quality visible images from SAR images using CNNs."2018 IEEE Radar Conference (RadarConf18). IEEE, 2018 *

Also Published As

Publication number Publication date
GB2586245A (en) 2021-02-17
GB201911577D0 (en) 2019-09-25
GB2595122B (en) 2022-08-24
GB2586245B (en) 2021-09-22

Similar Documents

Publication Publication Date Title
US20220335715A1 (en) Predicting visible/infrared band images using radar reflectance/backscatter images of a terrestrial region
GB2595122A (en) Method and apparatus
Yang et al. Estimation of corn yield based on hyperspectral imagery and convolutional neural network
Farmonov et al. Crop type classification by DESIS hyperspectral imagery and machine learning algorithms
Sivanpillai et al. Rapid flood inundation mapping by differencing water indices from pre-and post-flood Landsat images
Stroppiana et al. A method for extracting burned areas from Landsat TM/ETM+ images by soft aggregation of multiple Spectral Indices and a region growing algorithm
Pettorelli Satellite remote sensing and the management of natural resources
Peter et al. Multi-spatial resolution satellite and sUAS imagery for precision agriculture on smallholder farms in Malawi
Tang et al. Integrating spatio-temporal-spectral information for downscaling Sentinel-3 OLCI images
Simpson Remote sensing in fisheries: a tool for better management in the utilization of a renewable resource
de Souza Moreno et al. Deep semantic segmentation of mangroves in Brazil combining spatial, temporal, and polarization data from Sentinel-1 time series
Vlachopoulos et al. Evaluation of crop health status with UAS multispectral imagery
Mohammadpour et al. Applications of Multi‐Source and Multi‐Sensor Data Fusion of Remote Sensing for Forest Species Mapping
Wang et al. [Retracted] Remote Sensing Satellite Image‐Based Monitoring of Agricultural Ecosystem
Ramdani et al. Inexpensive method to assess mangroves forest through the use of open source software and data available freely in public domain
Marcaccio et al. Potential use of remote sensing to support the management of freshwater fish habitat in Canada
Wang et al. Data fusion in data scarce areas using a back-propagation artificial neural network model: a case study of the South China Sea
Sexton et al. Earth science data records of global forest cover and change
Khandelwal et al. Cloudnet: A deep learning approach for mitigating occlusions in landsat-8 imagery using data coalescence
Hamzeh et al. Retrieval of Sugarcane Leaf Area Index From Prisma Hyperspectral Data
Hazaymeh Development of a remote sensing-based agriculture monitoring drought index and its application over semi-arid region
Zhang et al. Space-Based Mapping of Mangrove Canopy Height with Multi-Sensor Observations and Deep Learning Techniques
Araújo et al. Satellite and UAV-based anomaly detection in vineyards
Malavé Mapping wetlands in Sweden using multi-source satellite data and Random Forest algorithm
EP4250250A1 (en) Carbon soil backend