ZENODO
Dataset . 2026
License: CC BY
Data sources: Datacite

ARAS400k: A Large-Scale Remote Sensing Dataset Augmented with Synthetic Data for Segmentation and Captioning

Grounding Synthetic Data Generation With Vision and Language Models
Authors: Çağlar, Ümit Mert; Temizel, Alptekin;


Abstract

ARAS400k

We introduce the fully open-sourced ARAS400k, a comprehensive remote sensing dataset consisting of 100,240 real images and 300,000 synthetic images. Each image is 256x256 pixels and is paired with a semantic segmentation map and 5 descriptive captions. In total, ARAS400k contains 400,240 images and 2,001,200 descriptive captions.

Subset           True Color Images   Segmentation Maps    Captions
Train                       80,192              80,192     400,960
Validation                  10,024              10,024      50,120
Test                        10,024              10,024      50,120
Real Total                 100,240             100,240     501,200
Synthetic                  300,000             300,000   1,500,000
ARAS400k Total             400,240             400,240   2,001,200

ARAS400k folder contents

Subset (train, val, test, synth)
    images
        0000.png
        0001.png
        ...
    masks
        0000.png
        0001.png
        ...
    captions.csv

An example from the synth folder: captions.csv contains the filename (unique), the split (train, val, test, or synth), the class percentages (Tree, Shrub, Grass, Crop, Built-up, Barren, Water), and one caption for each method: hybrid_gemma3-4b, hybrid_qwen3-vl-8b, text_qwen3-4b, vision_gemma3-4b, vision_qwen3-vl-8b.

filename: 0000.png
    split: synth
    Tree, Shrub, Grass, Crop, Built-up, Barren, Water: 0,0,18,82,0,0,0
    hybrid_gemma3-4b: The image depicts a landscape dominated by cultivated crops (82%), interspersed with smaller areas of grass (18%). The scene exhibits a patterned arrangement of fields, suggesting agricultural land use with some topographic variation.
    hybrid_qwen3-vl-8b: The scene is dominated by crop fields, covering 82% of the area in a patchwork of geometrically shaped agricultural plots, with grasslands making up the remaining 18% in interspersed, smaller areas.
    text_qwen3-4b: The scene is primarily agricultural, with 82% of the area covered by crops, indicating a dominant land use of cultivated fields. Grass covers a small portion (18%), suggesting limited natural or pasture land use.
    vision_gemma3-4b: The image depicts a landscape dominated by agricultural fields, likely cultivated with crops, arranged in a regular grid pattern. A prominent ridge or elevated area runs through the center, suggesting a hilly or undulating terrain.
    vision_qwen3-vl-8b: This satellite image shows a patchwork of agricultural fields and rural land use, characterized by geometrically shaped plots in varying shades of brown and tan, likely indicating different crops or soil types. A prominent, lighter-toned linear feature—possibly a river, canal, or road—cuts through the landscape, serving as a key geographical element.

filename: 0001.png
    split: synth
    Tree, Shrub, Grass, Crop, Built-up, Barren, Water: 9,2,88,1,0,0,0
    hybrid_gemma3-4b: The image depicts a landscape dominated by extensive grassland (88%), interspersed with scattered trees (9%) and a small area of crops (1%). A light-colored road and a network of smaller paths cut through the terrain, suggesting a pattern of human access and potentially agricultural activity within the predominantly grassy area.
    hybrid_qwen3-vl-8b: The scene is dominated by grasslands (88%), interspersed with small patches of tree cover (9%) and shrubs (2%), with sparse, fragmented crop areas (1%) suggesting a rural or semi-arid landscape with limited agricultural activity.
    text_qwen3-4b: The scene is predominantly grassland, with 88% coverage, indicating a large area of natural or managed grass cover, with minimal tree, shrub, or crop presence.
    vision_gemma3-4b: The image depicts a hilly landscape with a network of agricultural fields, primarily used for pasture or cultivation, intersected by a road and smaller dirt tracks. A meandering stream and forested areas are also visible, suggesting a varied terrain with a mix of human and natural land use
    vision_qwen3-vl-8b: This satellite image shows a patchwork of agricultural fields and forested areas, with a prominent river or stream winding through the landscape and a straight road cutting across the terrain, indicating human land use and infrastructure within a rural, possibly hilly, region.

The synthetic data (300k images) was created using only the train subset (80k images). The validation and test sets were never used during generation, so the synth subset introduces no leakage.
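As a rough illustration of how the captions.csv layout described above might be read, the following Python sketch parses the class-percentage field into a per-class mapping and groups the five captions by method. It rests on several assumptions that the record does not confirm: that the percentage column header is the comma-joined class list, that percentages are stored as integers in a single comma-separated field, and that the file is standard quoted CSV. The helper names (parse_class_percentages, read_captions, build_sample_csv) and the placeholder captions are hypothetical.

```python
import csv
import io

# Class order and caption methods as listed in the dataset description.
CLASS_NAMES = ["Tree", "Shrub", "Grass", "Crop", "Built-up", "Barren", "Water"]
CAPTION_METHODS = [
    "hybrid_gemma3-4b",
    "hybrid_qwen3-vl-8b",
    "text_qwen3-4b",
    "vision_gemma3-4b",
    "vision_qwen3-vl-8b",
]
# Assumption: the percentage column header is the comma-joined class list.
PERCENT_COLUMN = ", ".join(CLASS_NAMES)


def parse_class_percentages(field):
    """Turn a field such as '0,0,18,82,0,0,0' into a {class: percent} dict."""
    values = [int(v.strip()) for v in field.split(",")]
    if len(values) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} values, got {len(values)}")
    return dict(zip(CLASS_NAMES, values))


def read_captions(handle):
    """Read rows from an open captions.csv-like file into plain dicts."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "filename": row["filename"],
            "split": row["split"],
            "percentages": parse_class_percentages(row[PERCENT_COLUMN]),
            "captions": {method: row[method] for method in CAPTION_METHODS},
        })
    return records


def build_sample_csv():
    """Build an in-memory stand-in for captions.csv with placeholder captions."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["filename", "split", PERCENT_COLUMN] + CAPTION_METHODS)
    for name, percents in [("0000.png", "0,0,18,82,0,0,0"),
                           ("0001.png", "9,2,88,1,0,0,0")]:
        captions = [f"placeholder caption ({m})" for m in CAPTION_METHODS]
        writer.writerow([name, "synth", percents] + captions)
    buffer.seek(0)
    return buffer


records = read_captions(build_sample_csv())
for record in records:
    # Report the dominant land-cover class for each image.
    dominant = max(record["percentages"], key=record["percentages"].get)
    print(record["filename"], record["split"], dominant)
```

For a real subset, the in-memory buffer would be replaced by a file opened with newline="" and an explicit encoding, and the column names should first be checked against the actual header row.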
Code Repository

Python scripts for collecting and preparing data, training models, and running inference are available here.

Keywords

Image processing, Image Captioning, Deep learning, Semantic Segmentation, Remote sensing, Synthetic Data Generation
