Guide to Land Cover Classification using Google Earth Engine

Soumyadarshani Dash 10 Jul, 2024
9 min read


Land segmentation plays a significant role in remote sensing and geographic information systems (GIS), where it is used to analyze and classify different land cover types in satellite imagery. This guide will walk you through building a land segmentation model using Google Earth Engine (GEE) and integrating it with Python for enhanced functionality. By the end of this guide, you will understand how to load satellite imagery, preprocess it, and apply machine learning techniques for land cover classification.

Guide to Land Cover Classification using Google Earth Engine and Python

Learning Objectives

  • Understand how to set up and authenticate the Google Earth Engine (GEE) API for geospatial analysis.
  • Learn to retrieve and preprocess satellite imagery, including cloud masking, using GEE.
  • Gain the ability to calculate the Normalized Difference Vegetation Index (NDVI) for assessing vegetation health.
  • Acquire skills in preparing training data and applying k-means clustering for land cover classification.
  • Develop proficiency in visualizing geospatial data and classification results using Folium.
  • Implement error handling to ensure the reliability and robustness of satellite imagery processing code.

This article was published as a part of the Data Science Blogathon.

Introduction to Google Earth Engine

Google Earth Engine is a cloud-based platform for planetary-scale environmental data analysis. It combines a multi-petabyte catalog of satellite imagery and geospatial datasets with powerful processing capabilities. GEE is widely used for remote sensing tasks like land segmentation thanks to its robust processing capacity and extensive data catalog.

In this guide, we’ll walk through the process of land cover classification using Landsat imagery and GEE in Python. We’ll classify land cover into different classes using k-means clustering. Here’s what we’ll cover:

  • Setting up Google Earth Engine
  • Retrieving and Preprocessing Satellite Imagery
  • Cloud Masking
  • Calculating NDVI (Normalized Difference Vegetation Index)
  • Training Data Preparation
  • K-Means Clustering for Land Cover Classification
  • Visualization

Google Earth Engine provides all the data used in this model.

Setting Up Your Environment

First, install the Earth Engine API and authenticate your account using the following code:

# Install and Import the Earth Engine API
!pip install earthengine-api

import ee
import folium

# Authenticate and initialize with specific project
ee.Authenticate()
ee.Initialize(project='your-project-id')  # replace with your Cloud project ID

The Earth Engine API is a powerful geospatial analysis platform developed by Google, providing access to a vast archive of satellite imagery and geospatial datasets. It lets users perform large-scale processing and analysis of remote sensing data on Google's infrastructure.


This pop-up warns that any resources created using the API may be deleted if the API is disabled, and all code utilizing this project’s credentials to call the Google Earth Engine API will fail.

The background displays detailed metrics for various methods, including ListAlgorithms, ListOperations, ListAssets, and CreateMap, with their respective request counts, errors, and average latencies. The data indicates low usage and error rates, with latencies generally under half a second, except for CreateMap, which has a higher average latency of 1.038 seconds.


The “APIs & Services” dashboard on the Google Cloud Platform provides an overview of the API’s traffic, errors, and latency. According to the dashboard, there were 64 requests made to the Google Earth Engine API, with a 10.94% error rate, equating to 7 errors. The median latency stands at 229 milliseconds, while the 95th percentile latency reaches up to 2.656 seconds, indicating some variability in response times. The traffic and error graphs illustrate peaks at specific times, suggesting periods of higher activity or potential issues.

The Earth Engine API is a powerful tool for monitoring environmental variables such as vegetation health and land cover change using satellite imagery and geospatial data. This capability lets users analyze and track dynamic phenomena on the Earth's surface over time, providing essential insights for environmental monitoring and management.

Retrieving and Preprocessing Satellite Imagery

Define your Area of Interest (AOI) and fetch Landsat imagery:

aoi = ee.Geometry.Rectangle([-73.96, 40.69, -73.92, 40.71])

# Fetch Landsat imagery
landsat = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
    .filterBounds(aoi) \
    .filterDate('2020-01-01', '2024-05-30')

We use Landsat 8 imagery from the LANDSAT/LC08/C01/T1_SR dataset. Landsat 8, launched in 2013, is a satellite managed jointly by NASA and the U.S. Geological Survey (USGS). It carries two sensors: the Operational Land Imager (OLI), which captures data in nine spectral bands covering the visible, near-infrared, and shortwave infrared, and the Thermal Infrared Sensor (TIRS), which captures data in two thermal bands.

This dataset contains atmospherically corrected surface reflectance and land surface temperature derived from the data produced by these sensors. The bands used in this guide are:

  • Band 2 (Blue)
  • Band 3 (Green)
  • Band 4 (Red)
  • Band 5 (Near Infrared, NIR)
  • Band 6 (Shortwave Infrared 1, SWIR1)
  • Band 7 (Shortwave Infrared 2, SWIR2)

These bands are crucial for various remote sensing applications, including accurate assessment of different land cover types, cloud masking, and calculation of indices like NDVI for vegetation analysis. Together, these spectral bands enable the comprehensive remote sensing analysis needed for accurate land cover classification and vegetation assessment.



Cloud Masking

Cloud masking is the process of identifying and removing clouds and their shadows from satellite images to ensure clearer and more accurate analysis.

Create a function to mask clouds and apply it to the image collection:

def maskL8sr(image):
    # Bits 3 and 5 of the pixel_qa band flag cloud shadow and cloud, respectively
    cloudShadowBitMask = (1 << 3)
    cloudsBitMask = (1 << 5)
    qa ='pixel_qa')
    mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0) \
             .And(qa.bitwiseAnd(cloudsBitMask).eq(0))
    return image.updateMask(mask)

# Apply cloud masking function to the image collection
landsat =

In remote sensing, clouds can obscure the Earth's surface, leading to incorrect data interpretation. By applying cloud masking, we filter out these unwanted elements, allowing us to focus on the actual land features and perform precise tasks like land segmentation.

In our project, cloud masking is crucial because it helps eliminate interference from clouds, ensuring that our analysis and classification of land cover types are based on reliable and unobstructed imagery.


We create a function to mask clouds using the pixel quality attributes from the Landsat 8 images and apply this function to the entire image collection to ensure clearer, more accurate analysis. This step is essential for removing cloud and cloud shadow interference in our land cover classification process.
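The bit arithmetic behind maskL8sr can be sanity-checked in plain Python: in the pixel_qa band, bit 3 flags cloud shadow and bit 5 flags cloud, and a pixel is kept only when both bits are zero. A small standalone illustration with made-up QA values (not real Landsat data):

```python
# Bit positions used by the Landsat 8 pixel_qa band in maskL8sr
CLOUD_SHADOW_BIT = 1 << 3  # bit 3: cloud shadow
CLOUD_BIT = 1 << 5         # bit 5: cloud

def is_clear(qa_value):
    """Return True when neither the cloud-shadow nor the cloud bit is set."""
    return (qa_value & CLOUD_SHADOW_BIT) == 0 and (qa_value & CLOUD_BIT) == 0

print(is_clear(0b00000000))  # True: no flags set, pixel is kept
print(is_clear(0b00001000))  # False: cloud shadow, pixel is masked
print(is_clear(0b00100000))  # False: cloud, pixel is masked
```

This mirrors what `bitwiseAnd(...).eq(0)` does on the server side for every pixel in the collection.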

Calculating NDVI

Calculate NDVI for each image in the collection:

median_landsat = landsat.median()
ndvi = median_landsat.normalizedDifference(['B5', 'B4']).rename('NDVI')
median_landsat_with_ndvi = median_landsat.addBands(ndvi)

We calculate the Normalized Difference Vegetation Index (NDVI) for each image in the collection using the near-infrared (NIR) and red bands. NDVI is a key indicator of vegetation health and density, and it is calculated as follows:

NDVI = (NIR - Red) / (NIR + Red)

where the reflectance values come from the following bands of the satellite imagery:

  • NIR is the reflectance in the near-infrared band (Band 5 for Landsat 8).
  • Red is the reflectance in the red band (Band 4 for Landsat 8).

This index helps distinguish vegetated regions from non-vegetated zones in our land cover classification.


NDVI helps separate vegetated areas from non-vegetated ones. Higher NDVI values indicate healthier vegetation, which aids in accurately classifying land cover types, particularly in distinguishing vegetation from urban or barren regions.

The advent of NDVI changed vegetation monitoring by enabling the use of satellite data to provide consistent, reliable, and expansive insights into the Earth's vegetative landscapes.
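As a quick sanity check on the formula, NDVI can be evaluated directly on a couple of illustrative reflectance values (healthy vegetation reflects strongly in NIR, while water absorbs it):

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), bounded to [-1, 1]."""
    return (nir - red) / (nir + red)

vegetation = ndvi(nir=0.5, red=0.1)  # high NIR reflectance -> NDVI around 0.67
water = ndvi(nir=0.02, red=0.05)     # NIR absorbed by water -> negative NDVI
```

The reflectance values here are illustrative, but the relationship holds in general: dense vegetation pushes NDVI toward 1, while water and bare surfaces yield values near or below zero.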

Training Data Preparation

Prepare training data by sampling pixels from the image:

training =['B4', 'B3', 'B2', 'NDVI']).sample(
    region=aoi,
    scale=30,
    numPixels=1000
)

Prepare the training data by sampling pixels from the image. We select specific bands and calculate NDVI for each pixel, then sample these values over the defined AOI. This process involves extracting a representative set of pixels, which are used to train our clustering algorithm for land cover classification. The training data includes a specified number of pixels, ensuring a robust dataset for accurate model training.
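To build intuition for what sample() does here, the following pure-Python sketch (illustrative only, not the Earth Engine API) draws a fixed number of random pixel records from a synthetic grid, much as numPixels limits the sample size over the AOI:

```python
import random

# Hypothetical 10x10 grid of pixel records: (row, col, reflectance)
pixels = [(r, c, (r * 10 + c) / 100) for r in range(10) for c in range(10)]

random.seed(42)  # fixed seed for reproducibility
# Draw 5 distinct pixels, analogous to numPixels in Earth Engine's sample()
training_sample = random.sample(pixels, 5)
print(len(training_sample))  # 5
```

Earth Engine performs this sampling server-side at the requested scale (30 m here, matching Landsat's resolution), returning a FeatureCollection of band values per sampled pixel.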

K-Means Clustering for Land Cover Classification

Perform k-means clustering on the training data:

num_clusters = 5
clusterer = ee.Clusterer.wekaKMeans(num_clusters).train(training)
result = median_landsat_with_ndvi.cluster(clusterer)

Perform k-means clustering on the training data to classify land cover types. This uses the extracted pixel values, including the spectral bands and calculated NDVI, as input features for the clustering algorithm. K-means groups the pixels into a specified number of clusters based on their spectral similarity, allowing us to categorize different land cover types such as urban areas, vegetation, water bodies, bare soil, and mixed land cover. This unsupervised machine learning technique identifies distinct land cover classes without prior label information.
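The k-means algorithm itself is simple enough to sketch in plain Python. This toy one-dimensional version (unrelated to the weka implementation GEE wraps) alternates assignment and centroid-update steps on NDVI-like values:

```python
def kmeans_1d(values, k=2, iters=20):
    """Toy 1-D k-means: alternate assignment and centroid-update steps."""
    # Initialize centroids from the first k distinct sorted values
    centroids = sorted(set(values))[:k]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Update step: each centroid moves to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Low values suggest bare/urban pixels, high values suggest vegetation
values = [0.05, 0.1, 0.08, 0.7, 0.75, 0.8]
centroids, clusters = kmeans_1d(values)
```

The real clusterer does the same thing in a higher-dimensional feature space (one dimension per band plus NDVI), which is why the sampled training pixels are all it needs.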


Visualization

Visualize the original and clustered images using Folium:

# Visualization of original image with NDVI
map_before = folium.Map(location=[40.70, -73.94], zoom_start=12)

vis_params_before = {
    'bands': ['B4', 'B3', 'B2'],
    'min': 0,
    'max': 3000,
    'gamma': 1.4
}

map_before.add_ee_layer(median_landsat_with_ndvi, vis_params_before, 'Median Image with NDVI')

New York

# Visualization of clustered image
map_after = folium.Map(location=[40.70, -73.94], zoom_start=12)

vis_params_after = {
    'min': 0,
    'max': num_clusters - 1,
    'palette': ['red', 'green', 'blue', 'orange', 'gray']
}
map_after.add_ee_layer(result, vis_params_after, 'Clustered Image')

The color palette used in our land cover classification model assigns specific colors to different land cover types:

  • Red often represents urban or built-up areas due to their high reflectance in the visible red band, making it easy to identify high-density regions like cities or towns.
  • Green typically indicates vegetation, such as forests, grasslands, and agricultural fields, which have high reflectance in the near-infrared band and high NDVI values.
  • Blue is commonly used to depict water bodies, including rivers, lakes, and oceans, as water has low reflectance in most bands.
  • Orange represents bare soil or sparse vegetation, characterized by moderate reflectance in visible bands and lower NDVI values compared to dense vegetation.
  • Gray is used for areas not easily classified into other categories, such as mixed land cover types, shadowed regions, or barren lands with very low vegetation cover.
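Since k-means assigns cluster indices arbitrarily, the mapping from index to color and land cover label has to be fixed by inspection after clustering. A hypothetical legend lookup (the index-to-label pairing below is illustrative, not produced by the model):

```python
# Hypothetical mapping from cluster index to display color and label;
# with real output, inspect each cluster before assigning labels
cluster_legend = {
    0: ('red', 'urban / built-up'),
    1: ('green', 'vegetation'),
    2: ('blue', 'water'),
    3: ('orange', 'bare soil'),
    4: ('gray', 'mixed / unclassified'),
}

def describe_cluster(idx):
    """Return a human-readable description for a cluster index."""
    color, label = cluster_legend[idx]
    return f"cluster {idx}: {label} ({color})"

print(describe_cluster(1))  # cluster 1: vegetation (green)
```

Keeping the legend in one dictionary makes it easy to reorder the palette once you have verified which cluster corresponds to which land cover type.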

Error Handling

Adding error handling to the code makes it more robust and reliable:

try:
    # Code for retrieving and processing satellite imagery
    median_landsat = landsat.median()
    ndvi = median_landsat.normalizedDifference(['B5', 'B4']).rename('NDVI')
    median_landsat_with_ndvi = median_landsat.addBands(ndvi)
except Exception as e:
    print(f"An error occurred: {e}")

We also applied the same land cover classification model to the San Francisco area to evaluate its effectiveness in a different urban environment. Using the same process of retrieving Landsat imagery, cloud masking, NDVI calculation, and k-means clustering, we classified the land cover into five distinct types.


The resulting map shows a clear distinction between urban areas, vegetation, water bodies, bare soil, and mixed areas, demonstrating the model’s ability to segment diverse land cover types accurately. Below is the output image for San Francisco:

Future Applications

This land segmentation model can be extended and improved in several ways, providing solutions for a range of future challenges:

  • Environmental Monitoring: Continuously monitor changes in vegetation health, urban expansion, and water bodies.
  • Disaster Management: Assess damage from natural disasters like floods and wildfires by comparing pre-and post-event imagery.
  • Agricultural Planning: Monitor crop health and predict yields using vegetation indices.
  • Urban Planning: Analyze land use changes and plan sustainable urban expansion.
  • Climate Change Studies: Track long-term changes in land cover and their correlation with climate data.

By leveraging Google Earth Engine’s data processing capabilities and integrating them with Python, we can build robust models to address these challenges, providing valuable insights for researchers, policymakers, and planners.


Conclusion

This guide has walked you through the process of land cover classification using Google Earth Engine and Python. By retrieving and preprocessing satellite imagery, applying cloud masking, calculating NDVI, preparing training data, and using k-means clustering, we’ve classified land cover types in both New York and San Francisco. This methodology applies to various other regions and datasets, enabling the analysis of land cover changes, environmental monitoring, and urban planning. It allows for the classification of different land cover types and provides valuable insights into spatial patterns and dynamics.

Key Takeaways

  • The land segmentation model supports environmental monitoring, disaster management, agricultural planning, urban planning, and climate change studies.
  • GEE provides a cloud-based platform for accessing and processing huge volumes of satellite imagery and geospatial data.
  • You can adjust the land cover classification strategy for different regions and datasets by modifying parameters such as the region of interest and date ranges.
  • NDVI distinguishes healthy vegetation from other land cover types, crucial for accurate classification and monitoring.
  • Combining GEE with Python enhances the development of robust land cover classification models, offering valuable insights for various stakeholders.

Frequently Asked Questions

Q1. What is land segmentation, and why is it important?

A. Land segmentation, also known as land cover classification, involves dividing a geographical region into segments based on land cover types such as vegetation, urban areas, water bodies, and bare soil. This process is pivotal for environmental monitoring, urban planning, agriculture, and disaster management. It helps in understanding land use patterns, tracking changes over time, and making informed decisions for sustainable development.

Q2. How does Google Earth Engine facilitate land segmentation?

A. GEE provides a cloud-based platform with extensive satellite imagery and geospatial datasets. This enables efficient, large-scale analyses for complex land segmentation tasks.

Q3. How is NDVI used in land segmentation?

A. NDVI is a key indicator of vegetation health and density, calculated from the reflectance values in the near-infrared (NIR) and red bands of satellite imagery. In land segmentation, NDVI helps separate vegetated regions from non-vegetated ones. Higher NDVI values indicate healthier vegetation, which aids in accurately classifying land cover types, particularly in distinguishing vegetation from urban or barren zones.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
