The Starting Point: ESA WorldCover#
This project started with the ESA WorldCover dataset - a global land cover product at 10 m resolution based on Sentinel-1 and Sentinel-2 data. The full dataset consists of 2,561 GeoTIFF files totaling 116 GB. I wanted to convert it into a single PMTiles archive for use in a web-based flight simulator.
PMTiles is a single-file tile archive format designed for the cloud. Instead of hosting millions of individual tile files or running a tile server, you store one .pmtiles file on any static hosting (e.g. S3). Clients fetch tiles via HTTP range requests. There is no server-side logic required, which makes deployment trivially simple.
I started with the recommended tool, rio-pmtiles, which is Python-based and built on top of Rasterio and GDAL. I let it run for over a week - and it never finished. Worse, the process consumed so much memory that I had to configure 512 GB of swap space just to keep it alive. Frustrated, I decided to use Cursor to build a new tool in Go.
geotiff2pmtiles#
Two weeks of on-and-off work and $100 in Opus 4.6 tokens later, I have geotiff2pmtiles, which does the whole conversion in 16 hours. I already had a working solution after a couple of hours, but I spent some extra time fleshing out the project. It supports multiple projections, several GeoTIFF source types, and a slew of resampling methods.
Most of the processing time for the ESA WorldCover dataset goes into zoom level 13 - the highest available resolution. With a tile size of 512 px, an impressive 45 million tiles are computed at that level alone.
Here’s how long each zoom level takes to process:
| Zoom | Tiles | Throughput | Duration |
|---|---|---|---|
| 13 | 45,572,096 | 1,069/s | 710m 14s |
| 12 | 11,395,072 | 1,852/s | 102m 33s |
| 11 | 2,850,816 | 1,667/s | 28m 30s |
| 10 | 712,704 | 1,504/s | 7m 53s |
| 9 | 178,688 | 1,369/s | 2m 10s |
| 8 | 44,800 | 1,235/s | 36s |
| 7 | 11,264 | 995/s | 11s |
| Total | 60,765,440 | - | 15h 49m 42s |
A tool that got the job done was ready within the first week, but I decided to add more features and improve the performance.

Architecture#
geotiff2pmtiles reads Cloud Optimized GeoTIFF (COG) tiles on demand from disk, generates map tiles in parallel, and writes them into a PMTiles archive. The pipeline flows from the maximum zoom level downward: at the highest zoom, tiles are generated directly from the source COG files via per-pixel inverse projection and resampling. Each subsequent lower zoom level is built by downsampling 2x2 groups of tiles from the level above, avoiding redundant I/O. All generated tiles are collected into a two-pass PMTiles writer that handles deduplication, clustering, and directory construction.
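The top-down flow can be sketched in a few lines of Go. This is a simplified illustration of the pipeline structure described above, not the tool's actual code - `renderFromSource` and `downsample` are placeholders for the per-pixel reprojection and 2x2 downsampling stages:

```go
package main

import "fmt"

type tileID struct{ z, x, y int }

// Placeholders for the real stages: per-pixel reprojection from the COGs,
// and combining four children into one parent tile.
func renderFromSource(t tileID) string     { return fmt.Sprintf("render %v", t) }
func downsample(children [4]string) string { return "downsampled" }

// buildPyramid renders the max zoom from the source, then derives each
// lower zoom from the level above - no source I/O below the top level.
func buildPyramid(maxZoom, minZoom, dim int) map[tileID]string {
	tiles := map[tileID]string{}
	for x := 0; x < dim; x++ {
		for y := 0; y < dim; y++ {
			t := tileID{maxZoom, x, y}
			tiles[t] = renderFromSource(t)
		}
	}
	for z := maxZoom - 1; z >= minZoom; z-- {
		dim /= 2
		for x := 0; x < dim; x++ {
			for y := 0; y < dim; y++ {
				c := [4]string{
					tiles[tileID{z + 1, 2 * x, 2 * y}],
					tiles[tileID{z + 1, 2*x + 1, 2 * y}],
					tiles[tileID{z + 1, 2 * x, 2*y + 1}],
					tiles[tileID{z + 1, 2*x + 1, 2*y + 1}],
				}
				tiles[tileID{z, x, y}] = downsample(c)
			}
		}
	}
	return tiles
}

func main() {
	// A 4x4 top level yields 16 + 4 + 1 = 21 tiles across three zooms.
	fmt.Println(len(buildPyramid(2, 0, 4)))
}
```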
Memory-Mapped I/O#
Instead of loading entire GeoTIFF files into memory, the tool uses memory-mapped I/O (mmap) to access TIFF tiles on demand. The operating system manages paging - only the portions of the file currently being accessed reside in physical memory. This keeps the memory footprint bounded regardless of input file size. Multiple workers can read from the same mmap’d file concurrently without locking, since reads from a read-only mapping never conflict.
This does mean that Windows is not supported - the tool relies on POSIX mmap semantics and runs on Linux and macOS only.
COG Tile Cache#
Raw TIFF tiles (JPEG, LZW, or Deflate compressed) are decoded on first access and cached as image.Image objects in a shared LRU cache. The cache is sized at concurrency × 128 entries - large enough that Hilbert-batched workers rarely evict tiles their neighbors still need, but small enough to avoid excessive memory use. Cache keys are integer triplets (reader_id, ifd_level, tile_index) - an early optimization that eliminated string-hashing overhead that was consuming 18% of CPU time.
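The shape of such a cache is straightforward. The sketch below shows an LRU keyed by a comparable struct - three machine words that Go's map hashes directly, with none of the formatting and string-hashing cost of a `"reader:level:tile"` key. Locking is omitted for brevity; the tool's shared cache is of course synchronized:

```go
package main

import (
	"container/list"
	"fmt"
)

// tileKey is the integer triplet: no string allocation or hashing per lookup.
type tileKey struct {
	reader, level, tile int
}

type entry struct {
	key tileKey
	val any // the decoded image.Image in the real tool
}

type lru struct {
	cap   int
	order *list.List // front = most recently used
	items map[tileKey]*list.Element
}

func newLRU(cap int) *lru {
	return &lru{cap: cap, order: list.New(), items: map[tileKey]*list.Element{}}
}

func (c *lru) get(k tileKey) (any, bool) {
	if e, ok := c.items[k]; ok {
		c.order.MoveToFront(e)
		return e.Value.(*entry).val, true
	}
	return nil, false
}

func (c *lru) put(k tileKey, v any) {
	if e, ok := c.items[k]; ok {
		e.Value.(*entry).val = v
		c.order.MoveToFront(e)
		return
	}
	c.items[k] = c.order.PushFront(&entry{k, v})
	if c.order.Len() > c.cap {
		old := c.order.Back() // evict least recently used
		c.order.Remove(old)
		delete(c.items, old.Value.(*entry).key)
	}
}

func main() {
	c := newLRU(2)
	c.put(tileKey{0, 0, 1}, "A")
	c.put(tileKey{0, 0, 2}, "B")
	c.get(tileKey{0, 0, 1})      // touch A so it survives the next eviction
	c.put(tileKey{0, 0, 3}, "C") // evicts B
	_, ok := c.get(tileKey{0, 0, 2})
	fmt.Println(ok) // false
}
```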
For non-COG (strip-based) TIFFs, strips are promoted to virtual tiles at open time by merging small strips into blocks of at least 256 rows. This ensures that resampling kernels (up to 6x6 for Lanczos-3) never span more than a 2x2 tile grid.
Parallel Tile Generation#
Tiles are grouped into Hilbert-contiguous batches of 32 and distributed across a configurable worker pool. This batch size balances spatial locality (good cache reuse) with load distribution (minimal tail imbalance). Each worker performs per-pixel inverse projection and resampling from the cached COG tiles.
The Hilbert curve ordering is important: it keeps the 2D working set compact, so workers processing adjacent batches read from nearby regions of the source data. This dramatically improves COG tile cache hit rates compared to row-major or random ordering.
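The mapping from tile coordinates to Hilbert index is the classic iterative algorithm; the batching helper around it is illustrative of the scheme described above rather than the tool's exact code:

```go
package main

import (
	"fmt"
	"sort"
)

// xy2d maps a coordinate on an n×n grid (n a power of two) to its index
// along the Hilbert curve - the standard iterative formulation.
func xy2d(n, x, y int) int {
	d := 0
	for s := n / 2; s > 0; s /= 2 {
		rx, ry := 0, 0
		if x&s > 0 {
			rx = 1
		}
		if y&s > 0 {
			ry = 1
		}
		d += s * s * ((3 * rx) ^ ry)
		// Rotate the quadrant so the curve stays continuous.
		if ry == 0 {
			if rx == 1 {
				x = s - 1 - x
				y = s - 1 - y
			}
			x, y = y, x
		}
	}
	return d
}

type tile struct{ x, y int }

// batches sorts tiles by Hilbert index and slices them into fixed-size
// groups, so each worker touches a spatially compact region.
func batches(tiles []tile, n, batchSize int) [][]tile {
	sort.Slice(tiles, func(i, j int) bool {
		return xy2d(n, tiles[i].x, tiles[i].y) < xy2d(n, tiles[j].x, tiles[j].y)
	})
	var out [][]tile
	for i := 0; i < len(tiles); i += batchSize {
		end := i + batchSize
		if end > len(tiles) {
			end = len(tiles)
		}
		out = append(out, tiles[i:end])
	}
	return out
}

func main() {
	var tiles []tile
	for x := 0; x < 4; x++ {
		for y := 0; y < 4; y++ {
			tiles = append(tiles, tile{x, y})
		}
	}
	for _, b := range batches(tiles, 4, 4) {
		fmt.Println(b) // each batch is one contiguous curve segment
	}
}
```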
Pyramid Downsampling#
Lower zoom levels are generated by combining 2x2 child tiles into parent tiles, rather than re-reading from the source COGs. This avoids redundant I/O and resampling work. For elevation data (Terrarium encoding), the downsampler decodes RGB back to elevation values, averages valid samples, and re-encodes - ensuring correct interpolation even across tile boundaries.
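The elevation round-trip is the key detail here. Terrarium encoding (the Mapzen convention) stores elevation as elevation = (R·256 + G + B/256) − 32768, so averaging the RGB channels directly would mix high and low bytes and corrupt the result. A minimal sketch, with nodata masking omitted:

```go
package main

import "fmt"

// decodeElev recovers elevation (meters) from a Terrarium RGB pixel.
func decodeElev(r, g, b uint8) float64 {
	return float64(r)*256 + float64(g) + float64(b)/256 - 32768
}

// encodeElev is the inverse: shift so sea level needs no sign bit, then
// split into a high byte, low byte, and fractional byte.
func encodeElev(elev float64) (r, g, b uint8) {
	v := elev + 32768
	return uint8(int(v/256) & 0xff), uint8(int(v) & 0xff), uint8(int(v*256) & 0xff)
}

// downsample2x2 averages four child pixels in elevation space and
// re-encodes the mean, rather than averaging channels.
func downsample2x2(px [4][3]uint8) (uint8, uint8, uint8) {
	var sum float64
	for _, p := range px {
		sum += decodeElev(p[0], p[1], p[2])
	}
	return encodeElev(sum / 4)
}

func main() {
	var px [4][3]uint8
	for i, e := range []float64{100, 200, 300, 400} {
		px[i][0], px[i][1], px[i][2] = encodeElev(e)
	}
	r, g, b := downsample2x2(px)
	fmt.Println(decodeElev(r, g, b)) // 250
}
```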
A fast path handles single-channel (grayscale) data, which is about 15x faster than the general RGBA path since it can skip color-space conversions.
Different downsampling modes are supported:
- Bicubic (4x4 Catmull-Rom) - default, smooth and sharp
- Lanczos-3 (6x6) - highest quality, best for photographic data
- Bilinear - fast, reasonable quality
- Nearest-neighbor - no interpolation, preserves exact values
- Mode (most common value) - ideal for categorical/classified data where interpolation is meaningless
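Mode resampling is the simplest of these to show. For a 2x2 parent pixel it reduces to picking the most common of four child values (ties go to whichever value reaches the maximum count first); the WorldCover class codes in the example - 10 for tree cover, 80 for water - follow the dataset's legend:

```go
package main

import "fmt"

// mode4 returns the most common of four child values. Any interpolation
// here would invent class codes that don't exist in the legend.
func mode4(v [4]uint8) uint8 {
	counts := map[uint8]int{}
	best := v[0]
	for _, x := range v {
		counts[x]++
		if counts[x] > counts[best] {
			best = x
		}
	}
	return best
}

func main() {
	// Three tree-cover pixels (10) outvote one water pixel (80).
	fmt.Println(mode4([4]uint8{10, 10, 80, 10})) // 10
}
```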
Disk-Backed Tile Store#
Encoded tiles (PNG, WebP, JPEG bytes - typically 5-25x smaller than raw pixels) are held in a hybrid memory/disk store. A dedicated I/O goroutine continuously spills encoded tiles to a temp file when memory usage approaches the configured limit (~90% of system RAM by default). Blocked workers wait on a sync.Cond until the I/O goroutine frees enough memory.
Uniform tiles - tiles where every pixel is the same color - are stored as 4-byte objects and never spilled to disk. For datasets like ESA WorldCover with large uniform regions (oceans, deserts), this optimization alone eliminates a significant fraction of disk I/O.
The temp file is published via an atomic pointer after the first write, allowing concurrent lock-free reads via pread(2) from worker goroutines that need to retrieve tiles for downsampling.
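The worker/spiller handshake can be condensed to a `sync.Cond` around a byte budget. In this sketch the temp file is reduced to a counter and all names are illustrative, not the tool's actual API; the point is the blocking pattern - `put` waits under the lock until `spill` frees memory and broadcasts:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type store struct {
	mu      sync.Mutex
	cond    *sync.Cond
	inMem   int // bytes currently held in memory
	spilled int // bytes moved to the "temp file"
	budget  int
}

func newStore(budget int) *store {
	s := &store{budget: budget}
	s.cond = sync.NewCond(&s.mu)
	return s
}

// put blocks until the tile fits under the memory budget.
func (s *store) put(n int) {
	s.mu.Lock()
	for s.inMem+n > s.budget {
		s.cond.Wait() // woken by spill()
	}
	s.inMem += n
	s.mu.Unlock()
}

// spill moves everything currently in memory to disk and wakes waiters.
func (s *store) spill() {
	s.mu.Lock()
	s.spilled += s.inMem
	s.inMem = 0
	s.cond.Broadcast()
	s.mu.Unlock()
}

func (s *store) total() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.inMem + s.spilled
}

func main() {
	s := newStore(100)
	stop := make(chan struct{})
	go func() { // the dedicated spiller goroutine
		for {
			select {
			case <-stop:
				return
			default:
				time.Sleep(time.Millisecond)
				s.spill()
			}
		}
	}()
	for i := 0; i < 10; i++ {
		s.put(30) // blocks whenever the budget is nearly exhausted
	}
	close(stop)
	fmt.Println(s.total()) // 300: every byte is in memory or spilled
}
```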
Two-Pass PMTiles Writer#
The PMTiles v3 archive is assembled in two passes:
Pass 1 - Collection: Tiles are appended to a temp file as they arrive. Each tile is hashed with FNV-64a; if a tile with the same hash and length already exists, the writer reuses the existing offset instead of writing duplicate data. For datasets with many uniform tiles, this deduplication can eliminate 30-50% of tile writes.
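The dedup bookkeeping is small. The sketch below uses the standard library's `hash/fnv` and keys on the hash alone for brevity (the real writer keys on hash plus length, as noted above); the temp file is again reduced to an offset counter:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type dedupWriter struct {
	seen   map[uint64]int64 // FNV-64a hash -> offset of first occurrence
	offset int64            // current end of the temp file
	dupes  int
}

func fnv64a(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

// add returns the offset at which the tile's bytes live, appending only
// when this content has not been seen before.
func (w *dedupWriter) add(tile []byte) int64 {
	h := fnv64a(tile)
	if off, ok := w.seen[h]; ok {
		w.dupes++
		return off // reuse the existing data, nothing written
	}
	off := w.offset
	w.seen[h] = off
	w.offset += int64(len(tile)) // pretend-append to the temp file
	return off
}

func main() {
	w := &dedupWriter{seen: map[uint64]int64{}}
	ocean := []byte{0, 0, 80, 255} // a uniform-tile payload
	land := []byte{10, 120, 30, 255}
	a := w.add(ocean)
	w.add(land)
	b := w.add(ocean) // duplicate content
	fmt.Println(a == b, w.dupes) // true 1
}
```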
Pass 2 - Finalization: Entries are sorted by Hilbert tile-ID for clustering (this enables efficient HTTP range-request access in readers). Tile data is rewritten in sorted order to a new temp file, the directory structure is built, metadata is gzip-compressed, and the final archive is assembled: [Header] [Root Directory] [Metadata] [Leaf Directories] [Tile Data]. The root directory is constrained to 16 KiB to fit in the initial HTTP fetch that PMTiles clients make.
Resampling#
The resampling implementation went through several iterations to eliminate performance bottlenecks.
LUT-Accelerated Kernels#
Both the Lanczos-3 and bicubic kernels use precomputed lookup tables (1,024 entries each) instead of evaluating the kernel functions at runtime. The Lanczos-3 kernel involves 3·sin(πx)·sin(πx/3)/(π²x²) - two sin() calls per evaluation - which was a measurable CPU cost. The LUT replaces these with a table lookup and linear interpolation between entries.
Similarly, gamma correction uses a 4,096-entry LUT that maps [0, 4095] to uint8 values via power-law encoding. The step size (~0.06) is below the uint8 quantization threshold, so there’s no visible quality loss.
Lon/Lat Precomputation#
The naive approach computes longitude and latitude for every pixel independently - O(n²) trigonometric operations per tile. Since map tiles are axis-aligned, longitude only varies along x and latitude only varies along y. By precomputing arrays of tileSize longitude and latitude values, the projection cost drops to O(n) per tile. The arrays are recycled via sync.Pool to avoid allocation overhead.
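A sketch of that precomputation for Web Mercator, with function and pool names of my own choosing: longitude is linear in the pixel column, latitude comes from the inverse Mercator formula applied per row, and `sync.Pool` recycles the two slices between tiles:

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

const tileSize = 512

var lonLatPool = sync.Pool{
	New: func() any { return make([]float64, tileSize) },
}

// tileLonLat returns per-column longitudes and per-row latitudes (degrees,
// at pixel centers) for Web Mercator tile (z, x, y): O(n) trig instead of
// O(n²) for the full pixel grid.
func tileLonLat(z, x, y int) (lons, lats []float64) {
	n := float64(int(1) << z) // tiles per axis at this zoom
	lons = lonLatPool.Get().([]float64)
	lats = lonLatPool.Get().([]float64)
	for i := 0; i < tileSize; i++ {
		fx := float64(x) + (float64(i)+0.5)/tileSize
		lons[i] = fx/n*360 - 180
		fy := float64(y) + (float64(i)+0.5)/tileSize
		lats[i] = math.Atan(math.Sinh(math.Pi*(1-2*fy/n))) * 180 / math.Pi
	}
	return lons, lats
}

func main() {
	lons, lats := tileLonLat(0, 0, 0)
	fmt.Printf("%.2f %.2f\n", lons[0], lats[0]) // top-left of the world tile
	lonLatPool.Put(lons)                        // recycle for the next tile
	lonLatPool.Put(lats)
}
```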
Fast Paths by Image Type#
Resampling kernels have specialized implementations for each source image type (YCbCr, NYCBCRA, RGBA, Gray). Instead of going through the generic image.Image interface - which involves interface method dispatch and boxing per pixel - the specialized paths type-assert once and then access pixel data directly via struct fields. This eliminated another 10% of CPU time that was spent on interface boxing.
For Lanczos-3, when all 36 kernel positions (6x6) fall within a single source tile - the common case - a single-tile fast path avoids per-pixel tile cache lookups entirely.
Profiling-Driven Optimizations#
To reach the desired performance, I let Cursor run and profile the tool in a loop, deciding on its own what to focus on and improve next. Among other things, I ended up with this list of optimizations:
| Optimization | CPU before | CPU after | Impact |
|---|---|---|---|
| Integer cache keys | 101s (18%) | < 1s | Eliminated string hashing |
| Type-specific pixel reads | 57s (10%) | 0s | Eliminated interface boxing |
| Remove madvise | 48s (9%) | 0s | Eliminated redundant syscalls |
| Per-tile source/level prep | 37s (6%) | < 1s | Eliminated per-pixel recomputation |
| Batch work queue | - | - | Improved parallelism from 3.2x to 3.3x |
| Total | 594s CPU, 188s wall | 248s CPU, 81s wall | 2.3x faster |
I also got this nice summary of where the remaining CPU time is spent:
> The remaining CPU time is dominated by fundamental work: page faults for reading mmap’d COG data (19%), writing output pixels (16%), and YCbCr→RGB color conversion (14%). These represent the actual cost of decoding JPEG tiles and compositing output images - not overhead.
I should probably re-check the performance since I added a lot of features which I haven’t benchmarked yet.
Auto-Detection#
One feature I’m particularly happy with is the auto-detection system. When you point geotiff2pmtiles at a GeoTIFF, it examines the file structure and GDAL metadata to automatically configure the right processing pipeline:
- Float GeoTIFFs (e.g. Copernicus DEM) → Terrarium encoding for elevation data
- Multi-band satellite data (Sentinel-2, PlanetScope, HLS, Google Earth Engine exports) → automatic band ordering (RGB from the right bands), linear or log rescaling from the detected value range
- Categorical rasters (e.g. ESA WorldCover land cover) → mode resampling to preserve class values
The detection reads per-band DESCRIPTION and SCALE/OFFSET items from GDAL metadata (TIFF tag 42112), with fallback to dataset-level strings like "bands". For most common satellite products, you can just run geotiff2pmtiles input.tif output.pmtiles and get sensible output without specifying any flags.
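The decision order can be sketched as a small classifier. Everything here - the struct fields, the heuristics, the mode names - is purely illustrative of the detection logic described above, not the tool's actual API:

```go
package main

import "fmt"

// rasterInfo is a hypothetical summary of what detection reads from the
// file structure and GDAL metadata.
type rasterInfo struct {
	sampleFormat string // "float" or "int", from the TIFF SampleFormat tag
	bands        int
	categorical  bool // e.g. class descriptions found in GDAL metadata
}

func detectPipeline(r rasterInfo) string {
	switch {
	case r.sampleFormat == "float":
		return "terrarium" // elevation-style data
	case r.categorical:
		return "mode-resampling" // preserve class values
	case r.bands >= 3:
		return "rgb-rescale" // pick RGB bands, rescale the value range
	default:
		return "grayscale"
	}
}

func main() {
	dem := rasterInfo{sampleFormat: "float", bands: 1}
	worldcover := rasterInfo{sampleFormat: "int", bands: 1, categorical: true}
	sentinel2 := rasterInfo{sampleFormat: "int", bands: 13}
	fmt.Println(detectPipeline(dem), detectPipeline(worldcover), detectPipeline(sentinel2))
}
```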
pmtransform#
The project includes a companion tool, pmtransform, for modifying existing PMTiles archives without going back to the source GeoTIFFs. It operates in three modes:
- Passthrough - raw byte copy, no re-encoding. Used when only metadata changes.
- Re-encode - decode and re-encode tiles with a new format or quality setting. Useful for converting a JPEG tileset to WebP, or applying a fill color to empty tiles.
- Rebuild pyramid - full pyramid reconstruction from the max-zoom tiles. Required when changing zoom levels or resampling methods.
Since the PMTiles v3 header doesn’t store tile pixel dimensions, pmtransform decodes a sample tile at max zoom to discover the tile size before processing. Transform provenance is tracked in the PMTiles description field, so the full processing history is preserved across multiple transformations.
Source Code#
The project is open source and available on GitHub: pspoerri/geotiff2pmtiles.
Pre-built binaries for Linux and macOS (amd64/arm64) are available on the releases page. Windows is not supported due to the reliance on mmap.
