Embeddings
2026
- Streaming millions of TESSERA tiles over HTTP with Zarr v3: Describes how TESSERA's geospatial embedding system was restructured from millions of individual NumPy files into sharded Zarr v3 stores per year, enabling efficient HTTP range requests for single-pixel to regional data retrieval with xarray/dask compatibility. [Keywords: TESSERA, Zarr, embeddings, HTTP, geospatial, xarray, dask, cloud native]
- The Technical Debt of Earth Embedding Products: Examines fragmentation and interoperability challenges in Earth embedding products, arguing that standardizing how embeddings are distributed, stored, and accessed is the real bottleneck for geospatial foundation models. [Keywords: embeddings, geospatial, foundation models, interoperability, technical debt, cloud native]
2025
- GeoVibes: A geospatial tool for evaluating embedding models through interactive similarity search, using GeoParquet and Python for nearest-neighbor queries and binary classifier training with spatial cross-validation. [Keywords: embeddings, geospatial, similarity search, geoparquet, Python, classification]
- SkyScript: A large, semantically diverse image-text dataset for remote sensing containing 5.2 million image-text pairs with 29,000+ semantic tags, designed for vision-language model (CLIP) development. [Keywords: VLM, CLIP, satellite imagery, text, remote sensing, embeddings, dataset]
- Scalable Geospatial Data Generation Using AlphaEarth Foundations Model: Paper on using AlphaEarth foundation model embeddings for transfer learning in forest monitoring applications. [Keywords: foundation model, embeddings, AlphaEarth, forest, transfer learning]
- TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis: Foundation model paper generating 128-dimensional embeddings from satellite time-series for land classification and canopy height prediction at 10-meter global resolution. [Keywords: foundation model, embeddings, time series, Sentinel-2, land classification, canopy height]
- TESSERA GitHub: Open-source implementation of the TESSERA foundation model that processes satellite time-series imagery to generate embeddings for Earth observation tasks. [Keywords: foundation model, embeddings, satellite, Python, open source]
- What Do Embeddings Actually Encode in Earth Observation Foundation Models?: LinkedIn post discussing what semantic information EO foundation model embeddings actually capture. [Keywords: embeddings, foundation models, Earth observation, semantics]
- Air Quality Using Satellite Embedding: Preprint on using satellite-derived embeddings for air quality estimation and monitoring. [Keywords: air quality, satellite, embeddings, remote sensing]
- Text Embeddings for Semantic Search with Overture: Research on text-embedding-based semantic search over the Overture Maps places dataset. [Keywords: embeddings, semantic search, Overture Maps, NLP, geospatial]
- OSM Embeddings - SRAI: SRAI (Spatial Representations for AI) is a Python library for geospatial machine learning on vector geometries, enabling spatial data download, regionalization, and vector embeddings for ML tasks. [Keywords: OSM, embeddings, spatial AI, Python, geospatial ML]
Earlier
- AlphaEarthFire: AlphaEarth × MODIS burn dataset builder and model trainer using AEF embeddings to model slow fire variables and predict forest fires. [Keywords: embeddings, fire prediction, MODIS, AlphaEarth, foundation model, Python]