Monitoring rice paddy fields is critical for food security, water management, and methane emission tracking. However, traditional optical satellite imagery (like Sentinel-2) is often unusable in tropical and subtropical regions due to persistent cloud cover during the monsoon cropping season.
This project implements an end-to-end Deep Learning pipeline to automatically segment paddy fields using Sentinel-1 SAR (Synthetic Aperture Radar) Ground Range Detected (GRD) data. By leveraging the unique backscatter temporal signature of rice during its flooding and growth stages, the model achieves high-precision mapping regardless of weather conditions.
- Multi-Temporal Fusion: Processes a 60-band data stack representing a 20-date cropping season. Each date includes VV, VH, and VV/VH ratio polarizations to capture the "double-bounce" scattering effect characteristic of rice stems in water.
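The stack layout described above (20 dates × VV, VH, VV/VH = 60 bands) can be sketched as follows; this is a minimal NumPy illustration assuming per-date VV and VH arrays are already co-registered, not the project's exact stacking code:

```python
import numpy as np

def build_season_stack(vv_dates, vh_dates, eps=1e-6):
    """Stack 20 dates x (VV, VH, VV/VH) into a (60, H, W) array.

    vv_dates, vh_dates: lists of 2D backscatter arrays, one per acquisition date.
    """
    bands = []
    for vv, vh in zip(vv_dates, vh_dates):
        # The VV/VH ratio emphasizes the double-bounce response of flooded rice
        bands.extend([vv, vh, vv / (vh + eps)])
    return np.stack(bands, axis=0)

# 20 synthetic dates of 4x4 imagery -> a 60-band stack
vv = [np.random.rand(4, 4) for _ in range(20)]
vh = [np.random.rand(4, 4) for _ in range(20)]
stack = build_season_stack(vv, vh)
print(stack.shape)  # (60, 4, 4)
```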
- Geospatial Preprocessing: SAR radiometric calibration, thermal noise removal, and terrain correction (orthorectification) with ESA SNAP, driven from Python via pyroSAR.
- MLOps Architecture:
- DVC (Data Version Control): Manages heavy 60-band GeoTIFF stacks and model weights without bloating the Git repository.
- MLflow: Tracks hyperparameter experiments, loss curves, and evaluation metrics (IoU, F1-Score).
- High Performance: Optimized a U-Net + ResNet34 architecture to achieve a 0.85 IoU, successfully filtering "speckle noise" inherent in SAR data.
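The IoU and F1 scores logged to MLflow can be computed from binary masks as below; this is a minimal NumPy sketch of the metric definitions, not the project's evaluation code:

```python
import numpy as np

def iou_f1(pred, target):
    """Compute IoU and F1 for binary (0/1) segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    iou = inter / union if union else 1.0
    f1 = 2 * inter / total if total else 1.0
    return iou, f1

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
iou, f1 = iou_f1(pred, target)
print(round(iou, 3), round(f1, 3))  # 0.5 0.667
```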
- Sentinel-1 SAR: Multi-temporal C-band data (20 dates).
- JAXA LULC Map: Used for automated ground-truth label generation (Rice vs. Non-Rice).
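Deriving a binary label from a LULC raster reduces to a class-code comparison. A sketch (the `RICE_CLASS` code here is a placeholder, not the actual JAXA class value used in `labeling.py`):

```python
import numpy as np

RICE_CLASS = 7  # placeholder: substitute the paddy class code of the JAXA product

def lulc_to_binary(lulc):
    """Map a LULC class raster to a binary Rice / Non-Rice mask."""
    return (lulc == RICE_CLASS).astype(np.uint8)

lulc = np.array([[7, 3], [7, 1]])
mask = lulc_to_binary(lulc)
print(mask)  # [[1 0] [1 0]]
```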
- ROI: Niigata, Japan (High-intensity rice production region).
- Python 3.10 or 3.11
- NVIDIA GPU with CUDA support (Recommended)
Construct a virtual environment and install the required dependencies:
```shell
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

On Windows systems, you might encounter `OSError: [WinError 1114]` when importing `torch`. This is resolved by manually loading `c10.dll`. The following snippet is included in the project scripts:
```python
import os
import platform
import ctypes
from importlib.util import find_spec

if platform.system() == "Windows":
    try:
        if (spec := find_spec("torch")) and spec.origin:
            dll_path = os.path.join(os.path.dirname(spec.origin), "lib", "c10.dll")
            if os.path.exists(dll_path):
                ctypes.CDLL(os.path.normpath(dll_path))
    except Exception:
        pass
```

We use specific stable versions to ensure compatibility on Windows:
```
torch==2.5.1+cu121
torchvision==0.20.1+cu121
torchgeo==0.6.2
```
- `src/`: Source code for the pipeline stages.
  - `training.py`: Handles tiling and model training.
  - `testing.py`: Runs inference on test images and generates visualizations.
  - `labeling.py`: Prepares binary labels from external data.
- `data/`: Data directory (standardized structure).
  - `raw/`: Unprocessed input data.
  - `processed/`: Features, stacks, and labels.
  - `external/`: External shapefiles and validation data.
- `config.yaml`: Centralized configuration for paths and training parameters.
- `dvc.yaml`: DVC pipeline definition.
The project uses DVC to manage the workflow:
- Prepare Labels:
  ```shell
  python src/labeling.py
  ```
  - Merges source TIFs and creates binary masks/GeoJSON.
- Train Model:
  ```shell
  python src/training.py
  ```
  - Generates tiles (patches) from input rasters and labels.
  - Trains the segmentation model.
- Inference (Experimental):
  ```shell
  python src/testing.py
  ```
  - Performs semantic segmentation on test images.
  - Orthorectifies results and generates split-map visualizations.
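The stages above could be wired together in `dvc.yaml` roughly as follows; stage names, dependencies, and paths here are illustrative, so refer to the actual `dvc.yaml` in the repository:

```yaml
stages:
  labeling:
    cmd: python src/labeling.py
    deps:
      - src/labeling.py
      - data/external
    outs:
      - data/processed/labels
  training:
    cmd: python src/training.py
    deps:
      - src/training.py
      - data/processed
    params:
      - tile_size
      - epochs
      - batch_size
    outs:
      - models
```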
Modify config.yaml to adjust training parameters:
- `tile_size`: Size of the input patches (default: 512).
- `epochs`: Number of full passes over the training data.
- `batch_size`: Number of samples per training step.
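Splitting a large raster into `tile_size` patches, as `training.py` does before training, can be sketched like this (a simplified version that drops incomplete edge tiles; the real script may handle padding and overlap differently):

```python
import numpy as np

def make_tiles(raster, tile_size=512):
    """Split a (C, H, W) raster into non-overlapping (C, tile_size, tile_size) patches."""
    c, h, w = raster.shape
    tiles = []
    for y in range(0, h - tile_size + 1, tile_size):
        for x in range(0, w - tile_size + 1, tile_size):
            tiles.append(raster[:, y:y + tile_size, x:x + tile_size])
    return tiles

# Small synthetic 60-band stack; tile_size is 512 in the actual config
raster = np.zeros((60, 64, 96), dtype=np.uint8)
tiles = make_tiles(raster, tile_size=32)
print(len(tiles), tiles[0].shape)  # 6 (60, 32, 32)
```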
By default, large data files and models are stored in:
- Project root (`data/`, `models/`)
- External drive (if configured in `training.py`, e.g., `G:\data`)
To run the full pipeline through DVC:
```shell
dvc repro
```

To run individual steps:

```shell
python src/training.py
python src/testing.py
```