Osman Ibrahim
Accuracy Assessment · January 8, 2025 · 7 min read

Accuracy Assessment for Land Cover Maps: The Olofsson (2014) Method

Many land cover accuracy assessments report an overall accuracy computed from raw sample counts, which can be biased when the sample is not proportional to class area. The Olofsson et al. (2014) framework corrects for this and provides unbiased area estimates with confidence intervals. Here's how to implement it.

Accuracy Assessment · Land Cover · Statistics · QGIS · Remote Sensing

Osman Ibrahim

M.Sc. Geomatics · Remote Sensing & GIS Expert


The Problem with Simple Accuracy Assessment

Standard confusion matrix accuracy is computed by dividing the number of correctly classified sample points by the total number of sample points. This sounds reasonable, but it has a critical flaw: it treats every sample point as representing the same share of the map and so ignores the proportional area of each class.

If your map has 90% fallow land and 10% wheat, and you randomly sample 100 points, you'll get ~90 fallow samples and ~10 wheat samples. A classifier that labels everything as fallow achieves 90% overall accuracy while being completely useless for the minority class.
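To make the 90/10 example concrete, here is a minimal sketch with made-up labels. The arrays and variable names are illustrative only and do not come from a real dataset.

```python
import numpy as np

# Hypothetical reference labels for 100 random sample points: 90 fallow, 10 wheat
reference = np.array(['fallow'] * 90 + ['wheat'] * 10)

# A degenerate classifier that labels every point as fallow
predicted = np.full(reference.shape, 'fallow')

# Naive overall accuracy: share of sample points labelled correctly
overall_accuracy = np.mean(predicted == reference)

# Producer's accuracy (recall) for wheat: share of reference wheat points mapped as wheat
wheat_mask = reference == 'wheat'
wheat_recall = np.mean(predicted[wheat_mask] == 'wheat')

print(f"Overall accuracy: {overall_accuracy:.2f}")
print(f"Wheat producer's accuracy: {wheat_recall:.2f}")
```

The overall figure looks respectable while the wheat class is never detected, which is why per-class and area-weighted metrics are needed alongside it.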

Olofsson et al. (2014) — published in Remote Sensing of Environment — provides a statistically rigorous framework for:

  • Unbiased accuracy estimation using area-weighted metrics
  • Unbiased area estimation with confidence intervals
  • Stratified random sampling design

The Framework in 3 Steps

Step 1 — Stratified Sampling Design

Allocate sample points proportional to the mapped area of each class, with a minimum of 50 samples per stratum for reliable estimates:

import numpy as np

def compute_sample_allocation(class_areas_ha, total_samples=500, min_per_class=50):
    """
    Proportional allocation with minimum per class.
    class_areas_ha: dict {class_name: area_in_ha}
    """
    total_area = sum(class_areas_ha.values())
    proportions = {k: v / total_area for k, v in class_areas_ha.items()}
    
    # Proportional allocation
    allocation = {k: max(int(p * total_samples), min_per_class) 
                  for k, p in proportions.items()}
    
    return allocation, proportions

# Example: Gezira Scheme land cover
class_areas = {
    'Wheat': 420_000,
    'Other Crops': 180_000,
    'Fallow': 1_200_000,
    'Water': 20_000,
}

allocation, proportions = compute_sample_allocation(class_areas)
print("Sample allocation:", allocation)
# {'Wheat': 115, 'Other Crops': 50, 'Fallow': 329, 'Water': 50}
# Note: the per-class minimum pushes the total to 544, above total_samples=500
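Because the per-class minimum is applied after the proportional split, the allocation above can exceed the requested total. If the sample budget is fixed, one option is to reserve the minimum for the small classes first and split what is left among the remaining classes by area. The sketch below is one simple way to do that, not a procedure prescribed by Olofsson et al.; the helper name `allocate_with_budget` is hypothetical.

```python
def allocate_with_budget(class_areas_ha, total_samples=500, min_per_class=50):
    """
    Hypothetical helper: fixed-budget allocation with a per-class minimum.
    Single pass only, so the total can be off by a sample or two after rounding,
    and a class just above the minimum is not re-checked after redistribution.
    """
    total_area = sum(class_areas_ha.values())

    # Classes whose proportional share would fall below the minimum
    small = {k for k, v in class_areas_ha.items()
             if v / total_area * total_samples < min_per_class}

    # Budget and area left over for the remaining (large) classes
    remaining = total_samples - min_per_class * len(small)
    large_area = sum(v for k, v in class_areas_ha.items() if k not in small)

    return {
        k: min_per_class if k in small else round(remaining * v / large_area)
        for k, v in class_areas_ha.items()
    }


class_areas = {
    'Wheat': 420_000,
    'Other Crops': 180_000,
    'Fallow': 1_200_000,
    'Water': 20_000,
}

budgeted = allocate_with_budget(class_areas)
print(budgeted, sum(budgeted.values()))
```

For the Gezira example this gives 104 wheat, 50 other crops, 296 fallow, and 50 water samples, which adds up to the 500-point budget.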

Step 2 — Compute the Error Matrix with Area Weights

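The key insight is to weight each row of the confusion matrix by the proportion of the map that its class occupies. Each count n_ij is divided by the row total n_i (the number of samples drawn from map class i) and multiplied by W_i, the mapped area proportion of class i: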

def compute_weighted_error_matrix(confusion_matrix, class_proportions):
    """
    Converts a count-based confusion matrix to an area-weighted matrix.
    
    confusion_matrix: numpy array, rows=map class (sampling stratum), cols=reference class
    class_proportions: array of mapped area proportions (must sum to 1)
    """
    n_i = confusion_matrix.sum(axis=1)  # samples per stratum
    
    # Weighted matrix: p_ij = W_i * (n_ij / n_i)
    W = np.array(class_proportions)
    weighted = np.zeros_like(confusion_matrix, dtype=float)
    
    for i in range(len(W)):
        for j in range(len(W)):
            weighted[i, j] = W[i] * (confusion_matrix[i, j] / n_i[i])
    
    return weighted


# Example confusion matrix (rows=mapped class, i.e. sampling stratum; cols=reference class)
# Row totals are the number of samples actually drawn from each map stratum
# Classes: Wheat, Other Crops, Fallow, Water
cm = np.array([
    [108, 4,  3,  1],   # mapped Wheat
    [3,  44,  2,  1],   # mapped Other Crops
    [5,  2,  322,  5],  # mapped Fallow
    [0,  0,   3, 47],   # mapped Water
])
])

props = list(proportions.values())
weighted_cm = compute_weighted_error_matrix(cm, props)
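A quick sanity check helps catch orientation mistakes. Treating rows as the map classes (the sampling strata), each row of the weighted matrix should sum to W_i and the whole matrix should sum to 1. The sketch below repeats the example inputs so it runs on its own, and uses a vectorized NumPy form of the same formula as an alternative to the explicit loops.

```python
import numpy as np

# Example inputs repeated from above (class order: Wheat, Other Crops, Fallow, Water)
areas_ha = np.array([420_000, 180_000, 1_200_000, 20_000], dtype=float)
cm = np.array([
    [108, 4, 3, 1],
    [3, 44, 2, 1],
    [5, 2, 322, 5],
    [0, 0, 3, 47],
])

W = areas_ha / areas_ha.sum()          # mapped area proportions
n_i = cm.sum(axis=1, keepdims=True)    # samples per map stratum, shape (4, 1)

# p_ij = W_i * n_ij / n_i, computed with broadcasting instead of loops
weighted = W[:, None] * cm / n_i

# Each row must recover its stratum weight; the matrix must sum to 1
assert np.allclose(weighted.sum(axis=1), W)
assert np.isclose(weighted.sum(), 1.0)

print(weighted.round(4))
```

If either assertion fails, the most likely causes are a transposed matrix or proportions listed in a different class order than the matrix rows.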

Step 3 — Compute Unbiased Metrics

def compute_olofsson_metrics(weighted_cm):
    """
    Compute accuracy metrics following Olofsson et al. (2014).
    Expects an area-weighted matrix with rows=map class, cols=reference class.
    Returns overall accuracy, user's accuracy, producer's accuracy,
    and unbiased area proportions (standard errors are computed separately).
    """
    p_ij = weighted_cm
    
    # Overall accuracy
    OA = np.trace(p_ij)
    
    # User's accuracy (precision) — row perspective
    UA = {j: p_ij[j, j] / p_ij[j, :].sum() for j in range(p_ij.shape[0])}
    
    # Producer's accuracy (recall) — column perspective
    PA = {j: p_ij[j, j] / p_ij[:, j].sum() for j in range(p_ij.shape[1])}
    
    # Unbiased area proportion
    p_j = p_ij.sum(axis=0)
    
    return {
        'overall_accuracy': OA,
        'users_accuracy': UA,
        'producers_accuracy': PA,
        'area_proportions': p_j,
    }

results = compute_olofsson_metrics(weighted_cm)
print(f"Overall Accuracy: {results['overall_accuracy']:.3f}")
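To see what the metrics look like per class, the following self-contained sketch recomputes them for the example matrix and converts the unbiased area proportions to hectares. It assumes rows are map classes and columns are reference classes; the variable names are illustrative.

```python
import numpy as np

classes = ['Wheat', 'Other Crops', 'Fallow', 'Water']
areas_ha = np.array([420_000, 180_000, 1_200_000, 20_000], dtype=float)
cm = np.array([
    [108, 4, 3, 1],
    [3, 44, 2, 1],
    [5, 2, 322, 5],
    [0, 0, 3, 47],
])

W = areas_ha / areas_ha.sum()
p = W[:, None] * cm / cm.sum(axis=1, keepdims=True)  # area-weighted matrix

naive_oa = np.trace(cm) / cm.sum()   # raw sample counts
weighted_oa = np.trace(p)            # area-weighted

users = np.diag(p) / p.sum(axis=1)       # per map class (rows)
producers = np.diag(p) / p.sum(axis=0)   # per reference class (columns)

# Column sums are the estimated true area proportions; scale to hectares
adjusted_area_ha = p.sum(axis=0) * areas_ha.sum()

print(f"Naive OA: {naive_oa:.3f}  Area-weighted OA: {weighted_oa:.3f}")
for name, ua, pa, mapped, adj in zip(classes, users, producers, areas_ha, adjusted_area_ha):
    print(f"{name:<12} UA={ua:.3f} PA={pa:.3f} "
          f"mapped={mapped:>10,.0f} ha  adjusted={adj:>10,.0f} ha")
```

In this toy example the two overall accuracies are close, but the adjusted area of the smallest class differs substantially from its mapped area, because a handful of misclassified samples in a large stratum represent a lot of ground.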

Standard Errors and Confidence Intervals

The framework also provides variance estimators for each metric. For overall accuracy:

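def standard_error_OA(weighted_cm, sample_counts):
    """
    Equation 5 from Olofsson et al. (2014).

    sample_counts: dict {class_name: n_i}, in the same class order as the
    rows (map strata) of weighted_cm.
    """
    # Keep insertion order: sorting the keys would misalign n_i with the matrix rows
    n_i = np.array(list(sample_counts.values()), dtype=float)
    W_i = weighted_cm.sum(axis=1)
    
    # q_i = proportion of stratum i that is correctly classified
    q_i = np.diag(weighted_cm) / W_i
    
    # Variance of OA
    var_OA = np.sum(W_i**2 * (q_i * (1 - q_i)) / (n_i - 1))
    
    return np.sqrt(var_OA)

# Use the realized per-stratum sample counts (row totals of cm),
# which may differ from the planned allocation
realized_counts = dict(zip(class_areas, cm.sum(axis=1)))
se = standard_error_OA(weighted_cm, realized_counts)
OA = results['overall_accuracy']
print(f"OA: {OA:.3f} ± {1.96 * se:.3f} (95% CI)")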
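The same approach extends to the other metrics. The sketch below computes standard errors for user's accuracy and for the estimated area proportions, following my reading of the stratified estimators in the paper (Equations 6 and 10); check the formulas against the paper before relying on them. It assumes rows are map classes and columns are reference classes, and it repeats the example inputs so it runs on its own.

```python
import numpy as np

classes = ['Wheat', 'Other Crops', 'Fallow', 'Water']
areas_ha = np.array([420_000, 180_000, 1_200_000, 20_000], dtype=float)
cm = np.array([
    [108, 4, 3, 1],
    [3, 44, 2, 1],
    [5, 2, 322, 5],
    [0, 0, 3, 47],
])

n_i = cm.sum(axis=1).astype(float)   # samples per map stratum
W = areas_ha / areas_ha.sum()        # mapped area proportions
frac = cm / n_i[:, None]             # n_ij / n_i

# User's accuracy and its standard error: sqrt(U_i * (1 - U_i) / (n_i - 1))
ua = np.diag(frac)
se_ua = np.sqrt(ua * (1 - ua) / (n_i - 1))

# Estimated area proportion per reference class (column sums of p_ij)
area_hat = (W[:, None] * frac).sum(axis=0)

# Standard error of the area proportion for reference class j:
# sqrt( sum_i W_i^2 * frac_ij * (1 - frac_ij) / (n_i - 1) )
var_area = np.sum(W[:, None] ** 2 * frac * (1 - frac) / (n_i[:, None] - 1), axis=0)
se_area = np.sqrt(var_area)

total_ha = areas_ha.sum()
for name, u, su, a, sa in zip(classes, ua, se_ua, area_hat, se_area):
    print(f"{name:<12} UA={u:.3f} ± {1.96 * su:.3f}  "
          f"area={a * total_ha:>10,.0f} ± {1.96 * sa * total_ha:>8,.0f} ha")
```

The producer's accuracy variance is more involved and is left out of this sketch.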

My GeoAccuRate Plugin

Implementing this by hand is error-prone, especially the variance calculations. I built GeoAccuRate — a QGIS plugin that automates the entire Olofsson (2014) workflow:

  1. Load your classified raster and reference sample shapefile
  2. The plugin computes the weighted confusion matrix automatically
  3. Outputs: Overall accuracy, User's/Producer's accuracy, area estimates, and standard errors — all in a formatted report

Install it from the QGIS Plugin Repository (search "GeoAccuRate") or from GitHub.

Why This Matters

In the Gezira Scheme wheat mapping project, the simple accuracy was 94.2% but the Olofsson-corrected accuracy was 92.8% — a 1.4 percentage point difference that matters when reporting to FAO. More importantly, the area estimates shifted by up to 8% for minority classes, which directly affects the reported irrigated crop area statistics.

If you're producing land cover maps for policy or resource management decisions, please use this framework. The additional computation is minimal and the statistical validity is essential.


Osman Ibrahim

Remote Sensing & GIS Expert · M.Sc. Geomatics Engineering, KTU Turkey

Geomatics Engineer with 8+ years applying satellite data to water management, crop monitoring, and hydrology across Sudan and the Near East. Works with FAO, IFAD, and UNESCO. Author of WaPOR Water Productivity and GeoAccuRate QGIS plugins.
