Hack your own satellite deforestation tracker

April 04, 2019

Introduction

This post will show you how to hack together your own deforestation monitor that uses freely available satellite data and open source web technologies.

The tech I’m using is NodeJS, React and Mapbox, and the app is a static website so no server is required. It’s going to be quick & dirty for simplicity, so YMMV.

Here’s the result:

Deforestation tracker app

and it is deployed here: https://deforestation-tracker.bjnortier.now.sh/

The code is available on Github.

Part 1

The satellite data

We’ll be using NDVI imagery. NDVI is basically an index that uses visible and near-infrared values to compute a vegetation index:

“The normalized difference vegetation index (NDVI) is a simple graphical indicator that can be used to analyze remote sensing measurements, typically, but not necessarily, from a space platform, and assess whether the target being observed contains live green vegetation or not.” Wikipedia

You can get fairly current (delayed by a week or so) Proba-V NDVI data for free at 1km resolution (one pixel represents a 1km x 1km square on the earth’s surface). Proba-V is a European Space Agency (ESA) satellite, Thank you ESA for making this data freely available!

All you need to know for now is that the higher the NDVI index, the more photosynthesis, the more vegetation is being observed. More vegetation = more plants = more carbon sequestration.

The geographic area

I’m going to use the Kruger National Park as an example area, as it’s my country’s (🇿🇦) largest national park.

The boundary of the area is defined in GeoJSON, so you can adapt the code to any boundary of your choosing (if you have the GeoJSON and download the appropriate imagery data).

You can use geojson.io to easily visualise GeoJSON files, and here is our boundary:

KNP Boundary

Part 2

The code required has been published in a git repo. What you need before starting is

  1. A valid NodeJS environment https://nodejs.org/en/
  2. A valid Mapbox token https://www.mapbox.com

Clone the repo

Clone the repo and install all dependencies:

$ git clone https://github.com/bjnortier/deforestation-tracker.git
$ cd deforestation-tracker
$ npm i

Create a .env file which contains the Mapbox token, e.g. mine looks like this:

$ cat .env
export MAPBOX_TOKEN=pk.ey...

Get the Proba-V HDF5 files

The “Terms of Use” prohibits me from posting the data files to a network so you’ll unfortunately have to download them yourself.

Navigate to http://proba-v.vgt.vito.be/en/product-types, choose 1 KM (free data), then click on “S10-TOC-NDVI - Global 10-daily composites, Top-Of-Canopy, Normalized Difference Vegetation Index band”, or you can use this link.

A free signup or required to download the HDF5 files, so register first (top right-hand corner).

We’ll be looking at the Kruger National Park (KNP), so choose the following Region Of Interest (ROI). This defines which tiles will be part of the download:

Longitude: from 30 to 39. Latitude: from -15 to -24.

For “Date” we’ll choose 15/1/2010-01/01/2020 Note United States Users - the date format is not US, but World format, i.e. DD/MM/YYYY.

Click on “Search”.

There result will show many rows in the table, with Product IDs like “PVS10TOCNDVI-201902111KM_V101”.

Select the left-most checkbox and click “Order now…”

You will receive an email with the FTP link to download.

Once you have downloaded the files, copy them to resources/. The file listing should looks similar to this:

$ ls -h resources/
PV_S10_TOC_NDVI_20131011_1KM_V101
PV_S10_TOC_NDVI_20150811_1KM_V101
PV_S10_TOC_NDVI_20170611_1KM_V101
PV_S10_TOC_NDVI_20131111_1KM_V101
...

Each directory contains the tiles for a 10 day period.

The HDF files should total to about 144MB:

$ du -h resources
...
144M	resources

Create the masked PNG files

The raw NDVI data is delivered as HDF5 files, in tiles, for each time period.

We’ll read the HDF data and create a masked PNG for each tile, and at the same time count the NDVI values for each 1km x 1km area.

NB. We’ll be using NodeJS for this one-off process. Yes it is rather slow and you can do it way faster with something like python and parallel operations, but I’m keeping things simple with one programming language.

Run the script and while that is running you can look at the code:

$ node src/scripts/createKNP.js
>> 20131011 X21Y09.png
>> 20131011 X21Y10.png
[606000/1254400]
...

The script src/scripts/createKNP.js will traverse all the tile directories, but the interesting code is in src/scripts/createPNGs.js.

This function will:

  1. Create PNG tiles which contains the NDVI index in the GREEN band, and areas outside the boundary will have an opacity of 0. (The raw data actually ranges from 0 to 254, where 255 indicates no data).
  2. Compute the accumulated NDVI index value.
  3. Compute the number of pixels in the area of interest (used to compute the average NDVI value).

Read the HDF5 data:

There is a free viewer you can use to look at the contents of an HDF file, which will help you figure out the structure:

HDF5Viewer

/**
 * Open the HDF file for this tile in the current period,
 * and read the longitude, latitude and NDVI values.
 */
const file = new hdf5.File(path.resolve(path.join(hdf5SourceDir, hdf5Filename)))
const lons = h5lt.readDataset(file.id, 'lon')
const lats = h5lt.readDataset(file.id, 'lat')
const ndvi = h5lt.readDataset(file.openGroup('LEVEL3').openGroup('NDVI').id, 'NDVI')

Extract the X and Y tile number

Each tile has an X and Y tile number, which makes referencing easier. The Proba-V User Manual describes this scheme (also used in the HDF filename scheme):

Tile Numbering

/**
 * Extract the X and Y tile numbers.
 */
const match = /PROBAV_S10_TOC_X([0-9]{2})Y([0-9]{2})_[0-9]{8}_1KM_NDVI_V10[12]{1}.hdf5/.exec(hdf5Filename)
const xTile = match[1]
const yTile = match[2]
const tileName = `X${xTile}Y${yTile}`

Get the tile boundary coordinates

We read the tile boundaries and store them in a metadata structure:

/**
 * Read the tile boundaries from the HDF5 file
 */
const geometry = file.openGroup('LEVEL3/GEOMETRY')
geometry.refresh()
const meta = {
  bottomLeft: [geometry.BOTTOM_LEFT_LONGITUDE, geometry.BOTTOM_LEFT_LATITUDE],
  topLeft: [geometry.TOP_LEFT_LONGITUDE, geometry.TOP_LEFT_LATITUDE],
  bottomRight: [geometry.BOTTOM_RIGHT_LONGITUDE, geometry.BOTTOM_RIGHT_LATITUDE],
  topRight: [geometry.TOP_RIGHT_LONGITUDE, geometry.TOP_RIGHT_LATITUDE]
}
tilesMeta[tileName] = meta

Write a PNG for each tile

/**
 * Create a PNG file for each tile.
 *
 * 1. Use the NDVI value for the GREEN channel.
 *    The blue & red channels are kept at zero.
 * 2. Some NDVI values are null so opacity is 0
 *    for those values.
 * 3. Check each pixel for containment within
 *    the geographic boundary, ignore if outside.
 *    If the pixel is inside the boundary,
 *    accumulate it's value.
 */
const pngFilename = `${tileName}.png`
const pngPath = path.join(outputDir, pngFilename)
var png = new PNG({
  width: lons.length,
  height: lats.length,
  colorType: 6
})
for (var y = 0; y < png.height; y++) {
  for (var x = 0; x < png.width; x++) {
    const idx = (png.width * y + x) << 2
    const ndviForPixel = ndvi[png.width * y + x]
    png.data[idx] = 0
    png.data[idx + 1] = ndviForPixel
    png.data[idx + 2] = 0
    png.data[idx + 3] = ndviForPixel === 255 ? 0 : 255

    if ((png.width * y + x) % 100 === 0) {
      // Progress logging
      process.stdout.cursorTo(0)
      process.stdout.write(`[${png.width * y + x}/${png.height * png.width}]`)
    }
    if (ndvi[png.width * y + x] === 255) {
      // NDVI value is null if it is 255
      png.data[idx + 3] = 0
    } else {
      // Crop inside boundary
      const lon = geometry.BOTTOM_LEFT_LONGITUDE + 1 / 112 * x
      const lat = geometry.TOP_RIGHT_LATITUDE - 1 / 112 * y
      const coords = [lon, lat]
      const inside = booleanPointInPolygon(coords, knpBoundary.features[0].geometry)
      if (inside) {
        png.data[idx + 3] = 255
        numberOfPixels += 1
        accumulatedNDVI += ndviForPixel
      } else {
        png.data[idx + 3] = 0
      }
    }
  }
}
// Progress logging
process.stdout.cursorTo(0)
console.log('>>', date, pngFilename)

// Write the PNG file
const buffer = PNG.sync.write(png)
fs.writeFileSync(pngPath, buffer)

The results

The script will create PNG files that are trimmed to the geographic boundary containing NDVI values in the GREEN channel, e.g.:

X21Y09.png X21Y10.png

The script will also generate a JSON file (/static/KNP/site.json) with all the metadata, including the bounds of each tile and the accumulated NDVI data and number of pixels (use to calculate an average)

Part 3

Now that we have all the required data, we can build a React web app to render the results on a Mapbox map and a graph.

If you’re impatient you can just run the app (make sure you set the Mapbox Token as described in Part 2:

$ npm run start

> deforestation-tracker@1.0.0 start /Users/bjnortier/development/deforestation-tracker
> source .env && webpack-dev-server

ℹ 「wds」: Project is running at http://localhost:7333/
ℹ 「wds」: webpack output is served from /static/bundles/
ℹ 「wds」: 404s will fallback to /index.html
ℹ 「wdm」: wait until bundle finished: /static/bundles/index.bundle.js
ℹ 「wdm」: Hash: abc8487325facbe3c468
Version: webpack 4.29.5
Time: 4206ms
Built at: 04/03/2019 12:23:57 PM
          Asset      Size  Chunks             Chunk Names
index.bundle.js  5.49 MiB   index  [emitted]  index
Entrypoint index = index.bundle.js
[0] multi (webpack)-dev-server/client?http://localhost:7333 webpack-dev-server/client?http://localhost:7333 webpack/hot/dev-server ./src/app/index 64 bytes {index} [built]
[./node_modules/loglevel/lib/loglevel.js] 7.68 KiB {index} [built]
[./node_modules/querystring-es3/index.js] 127 bytes {index} [built]
[./node_modules/react-dom/index.js] 1.33 KiB {index} [built]
[./node_modules/react/index.js] 190 bytes {index} [built]
[./node_modules/strip-ansi/index.js] 161 bytes {index} [built]
[./node_modules/styled-components/dist/styled-components.browser.esm.js] 76.8 KiB {index} [built]
[./node_modules/styled-normalize/dist/index.js] 2.58 KiB {index} [built]
[./node_modules/url/url.js] 22.8 KiB {index} [built]
[./node_modules/webpack-dev-server/client/index.js?http://localhost:7333] (webpack)-dev-server/client?http://localhost:7333 8.1 KiB {index} [built]
[./node_modules/webpack-dev-server/client/overlay.js] (webpack)-dev-server/client/overlay.js 3.59 KiB {index} [built]
[./node_modules/webpack-dev-server/client/socket.js] (webpack)-dev-server/client/socket.js 1.05 KiB {index} [built]
[./node_modules/webpack/hot sync ^\.\/log$] (webpack)/hot sync nonrecursive ^\.\/log$ 170 bytes {index} [built]
[./node_modules/webpack/hot/dev-server.js] (webpack)/hot/dev-server.js 1.61 KiB {index} [built]
[./src/app/index.js] 1.03 KiB {index} [built]
    + 823 hidden modules
ℹ 「wdm」: Compiled successfully.

which will run the app at http://localhost:7333/.

A reminder that code is available on Github, or if you just want to play with the result it is deployed here: https://deforestation-tracker.bjnortier.now.sh/

Deforestation tracker app

Fetching the site data

Al the data is now located under /static/, so we don’t need a server and can deploy the app as a static website.

The main metadata file, site.json, is located under /static/KNP/site.json, and contains all the dates in the data range, as well as the tile metadata, and NDVI values. Once loaded we set the state of the React component to contain the site data, and set currentIndex to the latest time period:

componentDidMount () {
  fetch(`/static/KNP/site.json`, {
    credentials: 'same-origin',
    headers: {
      'Accept': 'application/json'
    }
  })
    .then(response => {
      return Promise.all([response.status, response.json()])
    })
    .then(([status, json]) => {
      if (status === 200) {
        this.setState({
          site: json,
          currentIndex: Object.keys(json.data).length - 1
        })
      } else {
        this.setState({ error: json })
      }
    }, err => {
      console.error(err)
      this.setState({ error: err.message })
    })
}

The Graph

Source code

The graph is an SVG graph using a library I wrote, react-svg-graphs.

Since this is a time series graph the X values have to be valid timestamps:

const xValues = dates.map(date => {
  const match = /([0-9]{4})([0-9]{2})([0-9]{2})/.exec(date)
  const year = match[1]
  const month = match[2]
  const day = match[3]
  return new Date(`${year}-${month}-${day}`).getTime()
})

And the Y values are a scaled NDVI value:

const yValues = dates.map(date => siteData[date].accumulatedNDVI / 1e6)

The graph is further configured with the X and Y values and labels (note the Y data is an array as multiple time series can be displayed).:

const graphData = {
  x: { label: 't', values: xValues },
  y: [{ label: 'Acc. NDVI', values: yValues }]
}

And finally the React component, which will render an SVG graph that is automatically resized:

return <ReactResizeDetector handleWidth>
  {(width, height) => <TimeXScalarYGraph
    data={graphData}
    width={width || 640}
    height={200}
    title='Kruger National Park Accumulated NDVI 1e6'
    periodLabel='7y'
    onHover={onHover}
  />}
</ReactResizeDetector>
}

The Map

Source code

The Map is quite complex with considerable boilerplate, but what’s important is that each date and tile is added as a layer, and these layers are hidden or shown depending on which date point is selected.

A source and layer is created for each tile (I think you can use multiple tiles in one layer but I haven’t figured that out yet…):

for (let j = 0; j < dates.length; ++j) {
  const date = dates[j]
  const tilesMeta = site.data[date].tilesMeta
  const tileNames = Object.keys(tilesMeta)
  for (let i = 0; i < tileNames.length; ++i) {
    const tileName = tileNames[i]
    const tileMeta = tilesMeta[tileName]
    const id = `${date}/${tileName}`
    map.addSource(id, {
      'type': 'image',
      'url': `/static/KNP/${date}/${tileName}.png`,
      'coordinates': [
        tileMeta.topLeft,
        tileMeta.topRight,
        tileMeta.bottomRight,
        tileMeta.bottomLeft
      ]
    })
    ++imageCount
    map.addLayer({
      id,
      source: id,
      type: 'raster',
      layout: {
        visibility: 'none'
      },
      paint: {
        'raster-opacity': 1
      }
    })
    map.moveLayer(id, 'waterway')
  }

and Mapbox has built-in support for rendering GeoJSON so we add the boundary as well:

map.addLayer({
  id: '`zaf-boundary`',
  type: 'line',
  source: {
    type: 'geojson',
    data: {
      type: 'Feature',
      geometry: site.boundary.features[0].geometry
    }
  },
  layout: {},
  paint: {
    'line-color': '#ff7f0e',
    'line-width': 2
  }
})

In the render method, depending on the selected date, we show that layer and hide the rest:

const map = this.getMap()
Object.keys(site.data).forEach((date, i) => {
  Object.keys(site.data[date].tilesMeta).forEach(tileName => {
    if (date === currentDate) {
      map.setLayoutProperty(`${date}/${tileName}`, 'visibility', 'visible')
    } else {
      map.setLayoutProperty(`${date}/${tileName}`, 'visibility', 'none')
    }
  })
})

And hovering over a specific date will show those tiles:

Deforest App in action

Next

That’s it.

There is a reason this is called a “Hack”. It’s a (albeit working) experiment but quite slow and only supports a single site.

If you look carefully there is a problem with the image not lining up with the boundary - I’m not sure what the cause is and needs to be fixed.

If I find a need to do this properly, these performance issues have to be looked at closely, as well as the tools to create & visualise the data for any user-defined site.