1.0 Apply image classification (supervised or unsupervised) - How To Guide

Modified on Thu, 19 Jan 2023 at 03:10 PM

In Earth observation, one of the most common tasks applied to satellite data is "classification".   This is the grouping of similar features within a satellite image into tangible thematic classes. In Earth Blox, the focus is on classifying a satellite image collection based on the spectral properties of each pixel and how those properties relate to ground-cover classes of interest such as woodland, agricultural land, urban, etc. 







Background: A quick primer on classification


The topic of “classification” is about identifying groups of pixels that have similar image properties, the purpose being to map landscapes and identify areas (rather than pixels) that are similar. In civilian remote sensing, this is best exemplified by crop mapping — the task here is to place each field into a discrete category, such as wheat, barley, potatoes, etc.  Each pixel in the image in question will be assigned to one of these discrete classes.


Methods that use a classification approach to characterise a target that actually varies on a continuous numeric (ratio) scale are more appropriately called thresholding, rather than classification. An example would be mapping the boundary between natural forest and woodland, which, even on the ground, can be poorly defined and difficult to measure in natural environments. That kind of problem uses a classification approach (a discrete scale of output classes) to draw arbitrary boundaries within an otherwise continuous variable. Here, we will stick to the example of crop mapping, as this is where classification has a strong track record.


Below are three images from Sentinel 2 over a rural location west of Regina in Saskatchewan, Canada.  Sentinel 2 has 13 spectral bands, but we are just showing three to keep the description easy to follow.  From left to right, these are the green, red and NIR bands.

 


Each band is slightly different. The NIR looks the brightest, which is what we might expect for an area covered in vegetation.  But there are varying shades for each field, and if you look closely, you will see that the fields do not have the same relative brightness across each band.  Let us focus on a small area so we can see this in detail.



These three pictures are a close-up of the above images.  Now we can look at individual fields and consider how they vary across each spectral band.   Note how these three fields all exhibit different patterns across the bands:

Field 1 is comparatively dark in all three images.

Field 2 is comparatively light in green and NIR, but dark in the red.

Field 3 is comparatively light in all three images.

These differences can arise for various reasons. Perhaps the crops are all the same but at different stages of growth, or perhaps some are fully grown and others have been harvested.  In the ideal case, the differences are indicative only of different crop types, and we can use this information to map (and quantify) the area of each crop.

 

To illustrate how classification works, let us simplify even further and look at only the red and NIR bands, and plot the pixel values on a pair of axes, one axis for each band (the left-hand figure below).  Fields 1 and 2 are both dark in the red, but they are separated in the NIR, where 2 is brighter than 1. And while 2 and 3 are both bright in NIR, they are distinguished by 3 being brighter than 2 in the red.   This is the principle by which we group the pixels.  Every pixel close to the centre of Field 1 pixels will be classed the same as Field 1, every pixel close to the centre of Field 2 as Field 2, etc. 
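The grouping rule just described — assign every pixel to whichever field centre it is closest to in red/NIR space — can be sketched in a few lines of NumPy. The band values below are invented for illustration, not real Sentinel 2 measurements:

```python
import numpy as np

# Hypothetical mean (red, NIR) reflectance for the three fields
centres = np.array([
    [0.05, 0.15],  # Field 1: dark in both bands
    [0.06, 0.45],  # Field 2: dark in red, bright in NIR
    [0.20, 0.50],  # Field 3: bright in both bands
])

def classify(pixels, centres):
    """Assign each (red, NIR) pixel to the nearest class centre."""
    # Distance from every pixel to every centre: shape (n_pixels, n_classes)
    d = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
    return d.argmin(axis=1)

pixels = np.array([[0.07, 0.44], [0.04, 0.16], [0.21, 0.52]])
print(classify(pixels, centres))  # → [1 0 2]
```

The same rule extends unchanged to any number of bands: each extra band simply adds another coordinate to the distance calculation.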




The challenge is that real data is not so neat. The second figure shows a more realistic scenario where there is a wide spread of pixels.  The pixels here are colour-coded based on which class they should be in, but you will notice that some of the pixels lie in overlapping zones. This is where errors in the classification will happen.  These errors may be due to imperfections in the image, such as the impact of clouds, or to variations in the fields, such as different crop densities, or to pixels that straddle the boundary between two classes.  


Classification is well suited for landscapes with well-defined discrete areas that can be allocated to a particular category.  In a more natural and constantly varying landscape, classification works less well as there are no natural groupings, and so there is no clumping of the pixels either. 


When using data such as Sentinel 2 or Landsat, we are not constrained to just two channels – we have several, and each band adds to the ability to separate the fields.  This is difficult to draw in three dimensions and impossible to draw in four or more dimensions, which is what we have when we classify with multiple bands.


Supervised vs Unsupervised Classification

There are two broad types of classification of satellite images. Unsupervised classification only considers the data, looking for common patterns in the multiple-image dimensions.  Where pixels seem to clump together (in terms of spectral properties) they are allocated to the same class. 


Supervised classification is used when you have some knowledge of “the right answers” – that is, you know what each class is that you are trying to determine and you can select some known areas to “train” the classification algorithm. Again, this is best exemplified using agricultural examples (although both methods might be used in other applications).


Note that classification needn’t just use spectral information. It can also include temporal variation, texture, or (for radar sensors) polarisation channels.


Unsupervised Classification

If we do not know what is within the image scene, we have to use unsupervised classification. This is when the analysis looks for statistical groupings of the pixels across the bands being used.  The user input is only to define how many groupings should be found, but the definitions of those groups within the data are determined by the data, not the user.
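Conceptually, an unsupervised classifier behaves much like k-means clustering: the user chooses only the number of groups, and the algorithm finds the groupings from the data alone. Here is a minimal NumPy sketch of the idea (an illustration, not Earth Blox's actual implementation):

```python
import numpy as np

def kmeans(pixels, k, n_iter=20, seed=0):
    """Tiny k-means: group pixels by spectral similarity alone.
    pixels: (n_pixels, n_bands) array; returns a cluster label per pixel."""
    rng = np.random.default_rng(seed)
    # Start from k randomly chosen pixels as cluster centres
    centres = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(n_iter):
        # Assign every pixel to its nearest centre...
        d = np.linalg.norm(pixels[:, None] - centres[None, :], axis=2)
        labels = d.argmin(axis=1)
        # ...then move each centre to the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centres[j] = pixels[labels == j].mean(axis=0)
    return labels

# Two obvious spectral "clumps" in (red, NIR) space
pixels = np.array([[0.05, 0.10], [0.06, 0.12], [0.30, 0.50], [0.31, 0.52]])
labels = kmeans(pixels, k=2)
print(labels)  # the first two pixels share one label, the last two the other
```

Note that the algorithm only numbers the groups – it cannot tell you that group 0 is wheat and group 1 is barley. That requires ground knowledge, which is where supervised classification comes in.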

 

Supervised Classification

Sometimes we know what some of the features on the surface are, and then we can use supervised classification.  This allows you to use a small area with field data to “train” the classification algorithm to recognise the same categories over a much larger area.  For example, you may have information on what crop is being grown in some fields. In supervised classification, we use this information to label the spectral groupings within the data based on the pixels we know.  The other pixels in each grouping are then labelled with the same class. 
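The labelling step can be illustrated with a small sketch: given group numbers from a clustering step and a handful of pixels whose crop we know, each spectral grouping takes the majority label of its known pixels. The crop names and pixel indices below are invented for illustration:

```python
from collections import Counter

def label_clusters(cluster_ids, known_idx, known_names):
    """Give each spectral grouping a class name, using the pixels we know.
    cluster_ids: group number per pixel; known_idx/known_names: positions
    and crop names of the pixels with ground data."""
    mapping = {}
    for cid in set(cluster_ids):
        # Crop names of the known pixels that fell into this grouping
        votes = [name for idx, name in zip(known_idx, known_names)
                 if cluster_ids[idx] == cid]
        mapping[cid] = Counter(votes).most_common(1)[0][0] if votes else "unknown"
    return [mapping[c] for c in cluster_ids]

clusters = [0, 0, 1, 1, 1]                            # output of a clustering step
known_idx, known_names = [0, 2], ["wheat", "barley"]  # two pixels we surveyed
print(label_clusters(clusters, known_idx, known_names))
# → ['wheat', 'wheat', 'barley', 'barley', 'barley']
```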

 



How to do classification on Earth Blox - Step by Step Guide


Step 1: Find the Classification Blocks

  • In Earth Blox, the classification blocks are under ANALYSE in the Toolbox.   
  • There are two main blocks: one for supervised and one for unsupervised classification.  We will look at the unsupervised first, as that is much simpler.


Step 2: Download the example area files

The area we are going to explore is just west of Regina, Saskatchewan. This is a good example because the fields are large and regular, and because we were able to find some corroborating data on which fields have which crops.


To follow this example, use the area-of-interest files that are attached to this article (at the bottom of this page).  You should download the following six files:

crop-OSR-regina.geojson

crop-grass-regina.geojson

crop-pulses-regina.geojson

crop-other2-regina.geojson

crop-cereals2-regina.geojson

crop-classification-Area1-Regina-Canada.geojson


These files contain the area that we are going to classify and the training areas for 5 different crops. The data are stored as GeoJSON files, which contain the coordinates of each area of interest.  Save them somewhere handy.
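A GeoJSON file is plain text. A minimal example of the kind of structure such a file contains (the coordinates below are invented, not taken from the attached files):

```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": { "name": "example training area" },
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [-104.9, 50.4], [-104.8, 50.4],
          [-104.8, 50.5], [-104.9, 50.5],
          [-104.9, 50.4]
        ]]
      }
    }
  ]
}
```

Coordinates are given as [longitude, latitude] pairs, and a polygon ring repeats its first point to close the shape.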



Unsupervised Classification


Step 1: Upload the area of interest

  • First, you need to upload the area of interest.  With a freshly opened Earth Blox window, click on the “upload area” icon on the right-hand side of the map.
  • Open the file:  crop-classification-Area1-Regina-Canada.geojson
  • When you open this file you will get a box in the middle of Canada – it will zoom to the box, so you will have to zoom out to see the location in context.
  • If you switch on the “satellite” view and look at the area of interest, you will see it is mostly agricultural fields, with a couple of rivers running through the area.  We aren’t going to classify this background satellite image, as it only has the visible bands.  We are going to use Sentinel 2 instead, and take advantage of the NIR bands that we know are useful for picking out details about vegetation. 

Step 2: Import the Sentinel 2 data

  • In Earth Blox, we need to choose the input data block first.  You then snap in the classification blocks.  
  • So, choose Sentinel 2 data within an optical block, and select dates from 31 May 2019 to 30 Sept 2019.  
  • Snap in the Unsupervised classification block.
  • Add an Other data visualisation block. It should look like this:
  • When you click Run workflow it will now analyse the data to look for 5 “clumps” of pixels that have similar spectral properties (the Earth Blox classification only uses spectral response).
  • The result will show that some of the fields clearly have a sufficiently consistent spectral response that they end up in the same class.  
  • This is a useful approach when investigating data that you haven’t seen before, and are looking to see patterns in the data.  
  • But to know what the crops are here, we need to have some data on the ground, and then we can use supervised classification.



Supervised Classification


Step 1: Upload the area of interest (same as in unsupervised classification)

  • Upload the area of interest exactly as in Step 1 of the unsupervised classification above: click on the “upload area” icon on the right-hand side of the map and open crop-classification-Area1-Regina-Canada.geojson.


Step 2: Build the supervised classification workflow

  • Replace the unsupervised block with the supervised block.
  • Note that the drop-down menu gives you 3 different classification algorithms.   The results will be largely similar, but we leave it to you to experiment by comparing the outputs.  You can find more detail about what is going on "under the hood" in the Earth Engine documentation:  https://developers.google.com/earth-engine/guides/classification
  • Check the Include class statistics checkbox, and select Area (m^2).



Step 3: Add the training classes

  • You will notice that this block has space to add more blocks. This is where we add a class block for each class for which we have ground data. 
  • For this example, we are going to have 5 different crop classes.  For each class block, we will specify training areas (provided for you in the attached documents at the bottom of this article).
  • If this sounds a little complicated, go to the workflow library and take a look at the supervised classification workflow.  Note that this is quite a data-intensive classification, so if you run it, you may have to wait some time to see the results.  But it illustrates what a classification block looks like. 
  • Now, for your task do the following steps to define the training sets for each class (the type of crop is in the file name).
    • Drag in a Class block inside the Supervised Classification block. 
    • Hover over User Areas and click on Add area on the map.
    • Click on Upload Area and choose one of the “crop-….” files. You will see several rectangles appear on the map.
    • Name the class in the class block to correspond to the name of the area file you uploaded.
    • Make sure the Area number and class name correspond to the Area number on the map and in the area file.
    • Repeat the above steps until you have all 5 classes.
  • Then make sure you have a visualisation block: "other data visualisation".  It should look like this:


  • Finally, run the workflow.
  • You will get an image on that map where each colour represents a different class. The legend will show you which is which.


Step 4: Dashboard output

  • The final step is to get some quantitative data from this process.
  • From the OUTPUT->DASHBOARDS toolbox, add the Table block beneath the visualisation block.
  • When you run the workflow this time, you can click on the DASHBOARD tab and see statistics, such as the area of each class.
  • From the Parameter option next to the Include Class Statistics checkbox, you can also choose to see the results in terms of pixel count or percentage area.
  • The dashboard tables for supervised classification will present statistics on how accurate the classification was (using your training data to "check the answers"). There is also a confusion matrix, from which these accuracy statistics have been calculated.
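To see how the accuracy figure relates to the confusion matrix, here is a small worked example with invented counts for two crop classes:

```python
import numpy as np

# Rows: true class, columns: predicted class (counts of checked pixels).
# Invented numbers for two crop classes, e.g. wheat and barley.
confusion = np.array([
    [40,  5],   # 40 wheat pixels correct, 5 mislabelled as barley
    [ 3, 52],   # 3 barley pixels mislabelled as wheat, 52 correct
])

# Overall accuracy: correctly classified pixels (the diagonal)
# divided by all checked pixels
overall_accuracy = np.trace(confusion) / confusion.sum()
print(f"overall accuracy: {overall_accuracy:.2f}")  # → 0.92
```

Because the accuracy is checked against the same training data, treat it as an upper bound: independent validation areas give a more honest estimate.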
  • It will also tell you how much area has been classified under each class.



 

We always track the feedback you give us so that we can improve how we help you. 

Please let us know if you found this one helpful.



