Create a cloud-free image of your area of interest - How To Guide

Modified on Wed, 19 Jul 2023 at 03:52 PM

Clouds are a problem for all optical imagery. As you can see from the image above, there are two main problems. The first is that clouds themselves obscure the land surface, but at least they are distinctive and should be easy to find and remove. The second is cloud shadow. Shadows are more problematic because you can still see the ground surface below, but the solar illumination differs from that of the non-shadow areas, so we want to remove shadowed pixels too. In this article, we will explore three different methods to remove clouds and produce a continuous cloud-free image.

To mask out clouds in Earth Blox, use the INPUT -> FILTER -> Mask Out Clouds block, shown below:

The basic principles of cloud removal 

Principle 1: Optical sensors can't see through clouds.

  • It is important to recognise that there is no way to remove clouds and see the ground below.  Visible and infra-red light can't pass through clouds.
  • All we can do is look at multiple images, collect as many cloud-free pixels as possible, and use them to make the final image. 
  • This means that "cloud removal" is really about removing those pixels that have the influence of cloud in them and keeping those that don't. That's why we call it "cloud masking". 
  • While big bright fluffy clouds are relatively easy to detect in an image, the thinner wispy clouds can be much harder to filter out. 

Principle 2: We can remove all images (from our analysis) that have too many clouds

  • The image output you see in Earth Blox is a result of stitching together (mosaicking) lots of different image scenes.  (The COLLATE block is what manages how these scenes are stitched together). 
  • The image suppliers tag each scene to let you know how much of each image is covered in clouds.
  • This means we can remove any image scenes that have too much cloud cover (since there is no point doing time-consuming processing to filter out clouds on a scene that will only give you a few cloud-free pixels).
  • This is the metric that the Mask Out Clouds filter block will ask you for as default. The Max Cloud Cover Percentage metric is essentially asking you, "What % of cloud cover in each scene are you willing to consider for processing?"  If you say "20" then only scenes with less than 20% cover will be used in the analysis and all the others will be discarded.  If you say "100" then every scene will be used, even the ones that are 100% cloud.
  • Depending upon the season, the geographic location and the time period you choose, it is possible that you will come across a situation where there are no scenes available that meet your % cloud criteria. If this is the case, you need to increase the % cloud cover metric or extend the time period being used to create the mosaic (so that more scenes are used for each mosaic, therefore increasing the likelihood of having a low-cloud image).
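
To make the scene-level filter concrete, here is a minimal Python sketch. It is not how Earth Blox is implemented: the scene records, the "cloud_cover" field and the function name are hypothetical, and including scenes exactly at the threshold is an assumption.

```python
# Hypothetical scene records. "cloud_cover" stands in for the per-scene
# cloud percentage tag provided by the image supplier.
scenes = [
    {"id": "scene_a", "cloud_cover": 5.0},
    {"id": "scene_b", "cloud_cover": 35.0},
    {"id": "scene_c", "cloud_cover": 100.0},
]

def filter_by_cloud_cover(scenes, max_cloud_cover_pct):
    """Keep scenes whose tagged cloud cover does not exceed the threshold.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    return [s for s in scenes if s["cloud_cover"] <= max_cloud_cover_pct]

kept = filter_by_cloud_cover(scenes, 20)
print([s["id"] for s in kept])

# If nothing passes, relax the threshold (or widen the time period).
if not filter_by_cloud_cover(scenes, 1):
    print("No scenes meet the criterion; try a higher percentage.")
```

With a threshold of 20 only scene_a survives, while a threshold of 100 keeps every scene, including the fully cloudy one.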

Principle 3: There are trade-offs

  • You will have to make some compromises when using optical data. (If you really can't make these compromises, then you might want to look at how to use radar data, which is not affected by clouds.)
  • For most parts of the world, if you wait long enough, you will find a cloud-free pixel for every part of your area of interest. 
  • But do take into account that the longer the time period you use to find cloud-free imagery, the greater the chance that some kind of change has happened on the ground surface (so your resulting image may include adjacent pixels from very different conditions).  This means you have to think carefully about what you are looking for in the imagery. 
  • If you are studying agriculture, then using 12 months of data to create your cloud-free image may result in crop-related data for many different stages in the growing cycle.  
  • If you are looking at tropical deforestation, then a 12-month summary of forest change may be sufficient.
  • This is when your choice of Median, Mean, Max, Min or Mosaic, within the Collate Images in Time (Composite) block becomes an important consideration.
    • Median: Returns the median value for all of the cloud-free pixels collected.
    • Mean: Returns the mean value for all of the cloud-free pixels collected.
    • Max (Min): Returns the maximum (minimum) value for all of the cloud-free pixels collected.
    • Mosaic: Returns the latest pixel value of all of the cloud-free pixels collected.
  • Note that Mean returns a composite value, whereas the others are all a single value from a single scene. (The median is a single value from a single scene if there are an odd number of scenes. If there is an even number of scenes, then it is the average of two values.)
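
The composite methods above can be sketched for a single pixel in a few lines of Python. This is an illustration only: the `composite` function and the sample values are hypothetical, and the values are assumed to be ordered from oldest to newest.

```python
from statistics import mean, median

def composite(values, method):
    """Reduce one pixel's cloud-free values (oldest to newest) to one value."""
    if method == "median":
        return median(values)
    if method == "mean":
        return mean(values)
    if method == "max":
        return max(values)
    if method == "min":
        return min(values)
    if method == "mosaic":
        return values[-1]  # the latest observation wins
    raise ValueError(f"unknown method: {method}")

# Hypothetical values for one pixel from four scenes.
pixel = [10, 30, 60, 20]

for method in ("median", "mean", "max", "min", "mosaic"):
    print(method, composite(pixel, method))
```

With four scenes the median (25) is the average of the two middle values and does not appear in any single scene. With an odd number of scenes, such as [10, 30, 60], the median (30) is an observed value.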

Method 1: A simple cloud-free image

Step 1: Set up the basic blocks

  • This will work for a relatively cloud-free area (cloud frequency <50%). 
  • It will work for both Landsat and Sentinel 2.
  • Use a container block with filter blocks to select the dataset, the area of interest and the time period.

Step 2: Choose the composite method to maximise cloud-free pixels

  • Choose Median for the Collate Images in Time (composite) block, and choose an interval of 1 month or longer. 
  • Your workflow should now look like this:

  • Since the median is the middle value, you will get a cloud-free value as long as more than 50% of the scenes didn't have cloud in that pixel.
  • This only works for Median because:
    • If you choose Max, you will selectively pick out the (high) values that are cloud.
    • If you choose Min, you will selectively pick out the (low) values that are cloud-shadow.
    • If you choose Mean, you will get an average value over cloudy and non-cloudy pixels.
  • For areas that have frequent cloud cover, this will not work. 
  • The example result below is for Barbados for an entire year. Note how this tropical island has had enough cloud cover inland during the year that there are still clouds in this image.
  • In this case, Method 2 or 3 is required.
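
The reasoning behind choosing Median can be checked with a small Python sketch. The reflectance values are made up for illustration: clear ground is around 12, cloud is bright (80 or more) and cloud shadow is dark (3).

```python
from statistics import mean, median

# Hypothetical values for one pixel across five scenes.
mostly_clear = [12, 13, 3, 85, 11]    # 3 clear, 1 shadow, 1 cloud
mostly_cloudy = [12, 80, 85, 90, 88]  # 1 clear, 4 cloud

print("median:", median(mostly_clear))  # a clear-sky value
print("max:", max(mostly_clear))        # picks out the cloud
print("min:", min(mostly_clear))        # picks out the shadow
print("mean:", mean(mostly_clear))      # blends clear and contaminated values

# Once most scenes are cloudy, the median is a cloud value too.
print("median (cloudy):", median(mostly_cloudy))
```

In the first series the median returns 12, a clear-sky value, while Max, Min and Mean are all affected by the contaminated scenes. In the second series the median is 85, which is why this method fails for frequently cloudy areas.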

Method 2: Using Default Cloud Masking 

Step 1: Default cloud mask

  • You now need to include the Mask Out Clouds block, and add it to the workflow as follows:

  • In default mode, this block does two things. 
  • First, it discards all scenes with a percentage of cloud cover greater than the Max Cloud Cover Percentage, so they are not used in any further analysis. Every image scene takes time to process, and a scene with too much cloud cover is not worth the time involved.
  • Secondly, for all of the remaining image scenes, it uses dataset-specific information to remove individual pixels that have too much cloud cover.  For example:
    • When it is Landsat data that has been selected, the Landsat quality band is used to remove cloudy pixels and cloud shadow pixels. 
    • When Sentinel 2 is selected, the S2 quality data layers are used to remove cloudy pixels (but not cloud shadow). 
  • You will have to experiment with the % cover metric to see what works best for you. 
  • Here is Barbados again, but this time with the default cloud mask and default settings (40%). There is a big improvement, but it's still not perfect.

Step 2: Interpret the output

  • Note that pixels that are never cloud free will not be assigned a value. 
  • These pixels will not appear on the map so you will see the background through these pixels. For example:

  • These pixels appear white because the ROADMAP background shows through. As a general rule it is better to have no data than bad data.
  • For your particular area, experiment with timescales and Max Cloud Cover Percentage values to see which is best for your application. 
  • Remember: there isn't a single cloud filter which is perfect for all time periods and all locations. 
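
The behaviour of pixels that are never cloud free can be sketched in Python as follows. This is a simplified illustration, not the Earth Blox implementation: `masked_median` and the per-scene cloud flags are hypothetical stand-ins for the quality information in the dataset.

```python
from statistics import median

def masked_median(values, is_cloudy):
    """Median of the cloud-free values for one pixel, or None if none exist."""
    clear = [v for v, cloudy in zip(values, is_cloudy) if not cloudy]
    return median(clear) if clear else None

# A pixel that was clear in two of three scenes keeps a value.
print(masked_median([12, 85, 14], [False, True, False]))

# A pixel that was cloudy in every scene gets no value at all,
# so the background map shows through at that location.
print(masked_median([85, 90], [True, True]))
```

Returning no value instead of a contaminated one reflects the rule above that no data is better than bad data.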

Method 3: Using Sentinel 2 s2cloudless 

Step 1: Setting the custom criteria for S2.

  • When using Sentinel 2 the cloud mask will give you an option for a custom setting, which looks like this:

  • In this case the cloud mask uses s2cloudless, an S2 cloud probability dataset. It works at the pixel level, giving each pixel a value for the likelihood that it contains cloud.
  • S2cloudless is a machine learning algorithm used to find clouds and create an image mask.
  • Because it works at the pixel level, it can take much longer than the default setting.
  • The S2 cloud mask will work best when there are fewer clouds in an image scene, so it is worthwhile using the Filter By Attribute block to first remove all the scenes with more than 40% cloud cover.
  • The options in the block are:  
    • Cloud Probability Threshold: This is the highest estimated cloud probability at which a pixel is still kept. If the estimated probability of cloud is higher, the pixel is discarded.
    • Cloud Projection Distance: This is the maximum distance (in km) from a cloud edge that the model looks for evidence of shadow.
    • NIR Dark Threshold: This helps determine areas of cloud shadow. When a pixel's NIR value is lower than this threshold, the model assumes it is shadow.
    • Buffer: When a cloud is identified, this is the distance (in m) around the object where pixels will be removed. 
  • Below is Barbados with this block included in the workflow using the pre-defined values (as in the image above). It clearly results in a much higher quality cloud-free image.
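
To show how these options interact, here is a heavily simplified Python sketch. It is not the s2cloudless algorithm or the Earth Blox block: the function names and default values are hypothetical, the probability is assumed to be on a 0 to 100 scale, the buffer is expressed in pixels instead of metres, and the Cloud Projection Distance step is left out.

```python
def classify_pixel(cloud_probability, nir,
                   prob_threshold=50, nir_dark_threshold=0.15):
    """Label one pixel using two thresholds (default values are hypothetical)."""
    if cloud_probability > prob_threshold:
        return "cloud"
    if nir < nir_dark_threshold:
        # The real workflow also checks that the pixel lies within the
        # projection distance of a cloud; that step is omitted here.
        return "shadow candidate"
    return "clear"

def buffer_mask(mask, buffer_px):
    """Grow a 1-D mask by buffer_px pixels on each side."""
    return [
        any(mask[max(0, i - buffer_px): i + buffer_px + 1])
        for i in range(len(mask))
    ]

print(classify_pixel(cloud_probability=80, nir=0.40))
print(classify_pixel(cloud_probability=10, nir=0.05))
print(classify_pixel(cloud_probability=10, nir=0.40))

# One cloudy pixel in a row of six, grown by a one-pixel buffer.
print(buffer_mask([False, False, True, False, False, False], 1))
```

Lowering the probability threshold or raising the buffer removes more pixels, which gives a cleaner image at the cost of more gaps.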

Step 2: S2 Cloudless

  • S2cloudless is a machine-learning algorithm used to find clouds and create a probability that there are clouds in each pixel.
  • Earth Blox uses this information to mask out the clouds.
  • It then calculates where the shadows will be, and masks them too.
  • Because of the extra calculations involved, this method may take longer than the default.  

A final note on NDVI 

You might be wondering what happens if you use the NDVI index block without a cloud filter block. The good news is that clouds have a very low NDVI, so if you are looking at trends in vegetation (which has a very high NDVI), simply ensure you place the NDVI block before the Collate Images in Time (composite) block, and then select Max for the composite method.

This will work fine in areas of low cloud cover, or long time periods.  But it is always worthwhile adding a cloud removal block first, just in case. 
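
Here is a small Python sketch of why the Max composite works for NDVI. It uses the standard formula NDVI = (NIR - Red) / (NIR + Red); the reflectance values and labels are made up for illustration.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return (nir - red) / (nir + red)

# Hypothetical observations of one vegetated pixel across three scenes.
observations = [
    {"label": "vegetation", "nir": 0.50, "red": 0.05},
    {"label": "cloud", "nir": 0.60, "red": 0.58},
    {"label": "vegetation", "nir": 0.45, "red": 0.06},
]

for obs in observations:
    print(obs["label"], round(ndvi(obs["nir"], obs["red"]), 2))

# The Max composite keeps the observation with the highest NDVI.
best = max(observations, key=lambda o: ndvi(o["nir"], o["red"]))
print("selected:", best["label"])
```

Clouds are bright in both bands, so their NDVI sits close to zero and the Max composite picks a vegetated observation whenever at least one exists for that pixel.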

Remember: There isn't a cloud filter which is perfect for all time periods and all climatic regions. Experiment to find out which one works best for your site, and your application. 

