5 Steps to Building a Better Bog Detector

Rob McEwan

This is part two of a two-part series

In part one of this series, we defined GeoAI and summarized the current state of GeoAI applications in the mineral exploration industry. This post describes how we built an end-to-end computer vision workflow to detect iron-stained outcrops using semantic segmentation and self-organizing maps on multi-platform remote sensing data.

The presence of iron oxides is one of the few diagnostic features in the visible spectrum that can assist mineral exploration efforts. Although multispectral and hyperspectral imagery can reveal the presence of many alteration and pathfinder minerals, open-source data suitable for a thorough spectral analysis of alteration minerals is limited to 30m pixel resolution.

While 30m resolution is fit for regional analyses, local analysis has significant limitations because an unknown number of potential outcrops is lost at the subpixel level. For our project, we had access to 0.5m resolution RGB imagery, which can distinguish those smaller outcrop exposures.

There is one main issue with using iron staining as the metric to train a computer vision classifier. The ‘red tint’ indicative of iron oxidation on outcrops can be easily confused with non-green vegetation, such as lichen and certain sphagnum mosses and bryophytes, the latter of which are prevalent in bogs and fens.

With some inspiration from a post by Descartes Labs, we decided to test a similar approach using time-series radar data to detect wetlands based on seasonal variability.

Bogs are a type of wetland characterized by variable acidity, salinity, and water level fluctuations. Our objective was to utilize a year’s worth of open-source SAR data from our study area and filter out potential wetlands based on the variation throughout the year, along with the RGB visualizations and a regional digital elevation model. Afterward, we could assess our original training data quality and make adjustments to the model as necessary.

SAR is very sensitive to water, and an analysis of time-series SAR data can help distinguish that seasonal variability. Much as ensemble methods outperform individual models in machine learning, a multi-platform approach to land-cover classification outperforms single-platform solutions in remote sensing.


Figure 1. Data source array from the project area. Digital elevation model, RGB image, RGB image with training data, and time-series SAR variation image, from left to right.

Let’s summarize the approach:

Objective: Use computer vision to detect iron oxides on outcrops

Issue: The “red tint” leads the model to confuse iron oxides with particular wetland vegetation

Solution: Integrate time-series radar imagery to filter out wetland pixels from the classification

Data sources: 0.5m aerial RGB imagery, 10m Sentinel-1 SAR imagery

1. Create training data

The initial training set was manually digitized from the RGB image with class labels for bog, outcrop, oxidized (iron staining), forest, water, and fluvial/lacustrine sediments (river and lake sediments). This dataset was then used to train our semantic segmentation model and as a baseline for assessing the seasonal SAR variations.

Keep in mind that since we started with only the single RGB image, these polygons may not be entirely accurate or complete. Nevertheless, we can still use them to get an idea of whether the method will work or not.
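To make this concrete, here is a minimal sketch of how digitized polygons can be burned into a label raster aligned with the RGB image. This isn't our exact pipeline; the file names and the class_id attribute are hypothetical:

```python
# Sketch: rasterize digitized training polygons into a label image that
# shares the RGB raster's grid. Assumes a hypothetical "class_id"
# attribute (e.g., 1=bog, 2=outcrop, 3=oxidized, ...), with 0 unlabeled.
import geopandas as gpd
import rasterio
from rasterio.features import rasterize

polygons = gpd.read_file("training_polygons.gpkg")  # hypothetical path

with rasterio.open("rgb_0.5m.tif") as src:          # hypothetical path
    meta = src.meta.copy()
    shapes = (
        (geom, cid)
        for geom, cid in zip(polygons.geometry, polygons["class_id"])
    )
    labels = rasterize(
        shapes,
        out_shape=(src.height, src.width),
        transform=src.transform,
        fill=0,              # 0 = unlabeled background
        dtype="uint8",
    )

meta.update(count=1, dtype="uint8", nodata=0)
with rasterio.open("labels.tif", "w", **meta) as dst:
    dst.write(labels, 1)
```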

2. Train and run the computer vision model

There are two inputs to the CNN:

  • A raster image that contains three bands
  • A label image that contains the label for each pixel corresponding to iron staining

We used the U-Net architecture to train our semantic segmentation model.

U-Net was originally developed for biomedical image segmentation. Its architecture can be broadly thought of as an encoder network followed by a decoder network. Unlike classification, where the final output of the deep network is the only thing that matters, semantic segmentation requires not only discrimination at the pixel level but also a mechanism to project the discriminative features learned at different stages of the encoder back onto the pixel space.


Blue boxes represent multi-channel feature maps, while white boxes represent copied feature maps.
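For readers who want something concrete, below is a minimal U-Net sketch in PyTorch. This is not our exact network; the depth and channel counts are illustrative, but it shows the encoder/decoder structure and the copied (skip) connections described above, with six output classes to match our training labels:

```python
# Minimal U-Net sketch: an encoder that halves resolution while doubling
# channels, and a decoder that upsamples and concatenates the copied
# encoder feature maps (skip connections).
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 conv + ReLU blocks, the basic unit of each U-Net stage.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, in_channels=3, n_classes=6):
        super().__init__()
        self.enc1 = double_conv(in_channels, 64)
        self.enc2 = double_conv(64, 128)
        self.enc3 = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)
        self.head = nn.Conv2d(64, n_classes, 1)   # per-pixel class scores

    def forward(self, x):
        e1 = self.enc1(x)                          # full resolution
        e2 = self.enc2(self.pool(e1))              # 1/2 resolution
        e3 = self.enc3(self.pool(e2))              # 1/4 resolution (bottleneck)
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                       # (N, n_classes, H, W)

model = UNet()
logits = model(torch.randn(1, 3, 256, 256))        # smoke test
```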

With our training features digitized, we trained our model and predicted the locations of iron oxide staining in the image.
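Training follows the standard per-pixel cross-entropy recipe. Continuing the sketch above, with an assumed optimizer and learning rate rather than our exact settings:

```python
# One training step: per-pixel cross-entropy between predicted class
# scores and the rasterized label image.
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed settings

def train_step(images, labels):
    # images: (N, 3, H, W) float tensor
    # labels: (N, H, W) int64 class ids in [0, n_classes)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```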

As expected, there are several false positives within probable wetlands.

3. Acquire time-series SAR data

Google Earth Engine is a fantastic resource for collecting remote sensing data, particularly time-series data. If you’re unfamiliar with Google Earth Engine, there is an excellent introduction to aggregators and reducers here with code examples.

We filtered all of the Sentinel-1 data from 2019 into twelve images, one for each month, aggregated by a median reducer. Each image contained three bands:

Red = Vertical-vertical (VV) polarization

Green = Vertical-horizontal (VH) polarization

Blue = The VV:VH ratio
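As a sketch of that compositing step using the Earth Engine Python API (the study-area geometry below is a placeholder, and the band math follows the red = VV, green = VH, blue = VV:VH scheme above):

```python
import ee

ee.Initialize()

aoi = ee.Geometry.Rectangle([-64.0, 47.0, -63.0, 48.0])  # placeholder study area

# Dual-polarization Sentinel-1 scenes over the AOI for 2019.
s1 = (
    ee.ImageCollection("COPERNICUS/S1_GRD")
    .filterBounds(aoi)
    .filterDate("2019-01-01", "2020-01-01")
    .filter(ee.Filter.listContains("transmitterReceiverPolarisation", "VV"))
    .filter(ee.Filter.listContains("transmitterReceiverPolarisation", "VH"))
    .filter(ee.Filter.eq("instrumentMode", "IW"))
)

def monthly_composite(month):
    # Median composite for one calendar month, plus the VV:VH ratio band.
    month = ee.Number(month)
    img = s1.filter(ee.Filter.calendarRange(month, month, "month")).median()
    ratio = img.select("VV").divide(img.select("VH")).rename("VV_VH")
    return img.select(["VV", "VH"]).addBands(ratio)

months = ee.List.sequence(1, 12)
composites = ee.ImageCollection.fromImages(months.map(monthly_composite))
```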

A full treatment of SAR band polarizations is outside the scope of this post and remains an active subject of research. Briefly, SAR wavelengths are much longer than optical wavelengths, which makes radar waves sensitive to the textural habits of objects on the ground, such as tree canopies and the surface roughness of outcrops. Radar waves are also sensitive to the dielectric constant of surface materials, which is why radar is excellent at detecting the presence of water.

Satellites capable of transmitting polarized radar take advantage of the fact that, in addition to scattering (SAR pixel intensities are a measurement of surface scattering), certain materials change the polarization of the emitted energy when they interact with it. Using the two polarizations and the ratio between them, we can create an RGB composite, which is very handy for visualization and adds an additional dimension to the data.

To get an image representing yearly variation, we layered all of the images together, calculated the standard deviation of each pixel across the 12 images, and created a new image from those values.
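Continuing the Earth Engine sketch, that yearly-variation image is a single reducer call over the twelve composites; the export settings are illustrative:

```python
# Per-pixel, per-band standard deviation across the 12 monthly composites.
variation = composites.reduce(ee.Reducer.stdDev())
# Output band names gain a "_stdDev" suffix, e.g. "VV" -> "VV_stdDev".

# Illustrative export of the variation image at Sentinel-1's 10m scale.
task = ee.batch.Export.image.toDrive(
    image=variation.clip(aoi),
    description="s1_2019_stddev",
    region=aoi,
    scale=10,
    maxPixels=1e9,
)
task.start()
```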

By sampling the pixel values within each class and plotting their standard deviations, we get a look at the distribution of radar values throughout the year. If we focus specifically on the VV polarization band (i.e., the red band when mapped to RGB color space), there appears to be enough separation between the wetland classes and iron-stained outcrops.
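One hypothetical way to produce that comparison is to sample the variation raster under each training class and box-plot a single band. This assumes the label raster has been resampled onto the 10m SAR grid; the file names and class ids are placeholders:

```python
import numpy as np
import rasterio
import matplotlib.pyplot as plt

# Both rasters assumed co-registered on the same 10m grid.
with rasterio.open("s1_2019_stddev.tif") as sar, \
     rasterio.open("labels_10m.tif") as lab:
    vv_std = sar.read(1)      # assumes band 1 = VV_stdDev
    labels = lab.read(1)

class_names = {1: "bog", 2: "outcrop", 3: "oxidized",
               4: "forest", 5: "water", 6: "sediments"}
samples = [vv_std[labels == cid] for cid in class_names]

plt.boxplot([s[np.isfinite(s)] for s in samples])
plt.xticks(range(1, len(class_names) + 1), list(class_names.values()))
plt.ylabel("VV standard deviation (2019)")
plt.show()
```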


4. Classify the SAR

Kohonen Self-Organizing Maps (SOMs) have recently received more and more attention as an under-utilized type of artificial neural network. SOMs are extremely useful when you have no training data available and no clear indication of how many clusters are in your data. They also differ from other artificial neural networks by using competitive learning instead of error-correction learning.

SOMs are essentially the neural-network equivalent of KMeans clustering, except the number of clusters in your dataset doesn't need to be specified. Additionally, SOMs can be very quick to train, allowing you to quickly observe how adjusting different model parameters affects the network.

This last property is helpful because the trained network is a 2D map, with the x and y coordinates corresponding to specific neurons. Neurons in the net are sometimes known as units, and the winning neuron for any pixel in the image is the best matching unit (BMU) for that pixel. The sigma and neighborhood functions determine the degree of influence that winning neurons have on surrounding neurons, resulting in either many discrete clusters or fewer, less-defined clusters.


Figure 3. Two SOM networks trained on time-series SAR data with a high sigma value (left) and a low sigma value (right).
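One possible implementation uses the MiniSom library, which keeps that experimentation loop fast. The grid size, sigma, and iteration count below are illustrative, and the feature-vector file is a placeholder:

```python
import numpy as np
from minisom import MiniSom

# pixels: one feature vector per 10m SAR pixel (e.g., the flattened
# stdDev bands), standardized beforehand. The file name is a placeholder.
pixels = np.load("sar_feature_vectors.npy")  # shape: (n_pixels, n_features)

som = MiniSom(
    10, 10, pixels.shape[1],            # 10x10 grid of neurons
    sigma=1.5,                          # neighborhood radius
    learning_rate=0.5,
    neighborhood_function="gaussian",
    random_seed=42,
)
som.random_weights_init(pixels)
som.train_random(pixels, 10_000)        # train on 10,000 random samples

# Best matching unit (BMU) for every pixel: a (row, col) grid coordinate.
bmus = np.array([som.winner(p) for p in pixels])
```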

In either case, some element of supervision is required to convert the SOM output back to labeled classes. We used KMeans to assign a label to each SOM cluster and then sampled the neural net to assign the winning class label back to each pixel.
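Continuing the sketch, that supervision step can be as simple as clustering the SOM's codebook vectors with KMeans and mapping each pixel's BMU through the resulting lookup table (six clusters is an assumption here):

```python
from sklearn.cluster import KMeans

# Cluster the SOM codebook (one weight vector per neuron) into classes.
weights = som.get_weights()                       # shape: (10, 10, n_features)
codebook = weights.reshape(-1, weights.shape[-1])
km = KMeans(n_clusters=6, n_init=10, random_state=42).fit(codebook)

# Arrange the cluster id of each neuron on the 10x10 SOM grid, then map
# every pixel's BMU coordinate to its cluster label.
neuron_labels = km.labels_.reshape(10, 10)
pixel_classes = neuron_labels[bmus[:, 0], bmus[:, 1]]
```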

5. Overlay and filter the results

With our classified time-series SAR image and the outputs from our computer vision model, we combined the results by running a difference operation, which erased all of the outcrop and iron oxide predictions within the boundaries of our likely wetland classes. Since we clipped 0.5m resolution predictions with 10m resolution classes, there was some loss of resolution, but the radar-derived wetland classes did a terrific job of removing the false positives.


Figure 4. The output of the combined model before and after difference operation. A) RGB image B) image with outcrop predictions from computer vision model in red C) filtered outcrop in green overlaid with original outcrop predictions D) filtered outcrop with visible wetlands.
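Here is a minimal NumPy sketch of that difference operation, assuming the 10m SAR class map is upsampled to the 0.5m prediction grid by nearest-neighbor replication; the wetland cluster ids are placeholders:

```python
import numpy as np

WETLAND_CLUSTERS = {0, 3}   # hypothetical SOM cluster ids flagged as wetland

def filter_predictions(predictions, sar_classes):
    # predictions: (H, W) 0.5m class map from the CNN
    # sar_classes: (h, w) 10m cluster map from the SOM + KMeans step
    scale = predictions.shape[0] // sar_classes.shape[0]   # ~20 for 10m -> 0.5m
    wetland = np.isin(sar_classes, list(WETLAND_CLUSTERS)).astype(np.uint8)
    # Nearest-neighbor upsample of the wetland mask by pixel replication.
    wetland_hi = np.kron(wetland, np.ones((scale, scale), dtype=np.uint8))
    out = predictions.copy()
    h, w = out.shape
    out[wetland_hi[:h, :w].astype(bool)] = 0   # erase predictions in wetlands
    return out
```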


The result is a much cleaner set of predictions. By significantly reducing the number of false positives, we are left with higher-quality targets on which to focus exploration efforts. Additionally, identifying areas with high water fluctuations associated with wetlands can assist in project planning and mitigating environmental risk.
