
ARAS400k

We introduce the fully open-sourced ARAS400k, a comprehensive remote sensing dataset consisting of 100,240 real images and 300,000 synthetic images. Each image is 256x256 pixels and is paired with a semantic segmentation map and 5 descriptive captions. In total, ARAS400k contains 400,240 images and 2,001,200 descriptive captions.

| Subset | True Color Images | Segmentation Maps | Captions |
| --- | --- | --- | --- |
| Train | 80,192 | 80,192 | 400,960 |
| Validation | 10,024 | 10,024 | 50,120 |
| Test | 10,024 | 10,024 | 50,120 |
| Real Total | 100,240 | 100,240 | 501,200 |
| Synthetic | 300,000 | 300,000 | 1,500,000 |
| ARAS400k Total | 400,240 | 400,240 | 2,001,200 |

ARAS400k folder contents

Subset (train, val, test, synth)
    images
        0000.png
        0001.png
        ...
    masks
        0000.png
        0001.png
        ...
    captions.csv

Each captions.csv contains the following columns:

- filename (unique)
- split (train, val, test, or synth)
- class percentages (Tree, Shrub, Grass, Crop, Built-up, Barren, Water)
- one caption for each method: hybrid_gemma3-4b, hybrid_qwen3-vl-8b, text_qwen3-4b, vision_gemma3-4b, vision_qwen3-vl-8b

An example from the synth folder:

0000.png (split: synth; Tree, Shrub, Grass, Crop, Built-up, Barren, Water: 0,0,18,82,0,0,0)

- hybrid_gemma3-4b: The image depicts a landscape dominated by cultivated crops (82%), interspersed with smaller areas of grass (18%). The scene exhibits a patterned arrangement of fields, suggesting agricultural land use with some topographic variation.
- hybrid_qwen3-vl-8b: The scene is dominated by crop fields, covering 82% of the area in a patchwork of geometrically shaped agricultural plots, with grasslands making up the remaining 18% in interspersed, smaller areas.
- text_qwen3-4b: The scene is primarily agricultural, with 82% of the area covered by crops, indicating a dominant land use of cultivated fields. Grass covers a small portion (18%), suggesting limited natural or pasture land use.
- vision_gemma3-4b: The image depicts a landscape dominated by agricultural fields, likely cultivated with crops, arranged in a regular grid pattern. A prominent ridge or elevated area runs through the center, suggesting a hilly or undulating terrain.
- vision_qwen3-vl-8b: This satellite image shows a patchwork of agricultural fields and rural land use, characterized by geometrically shaped plots in varying shades of brown and tan, likely indicating different crops or soil types. A prominent, lighter-toned linear feature—possibly a river, canal, or road—cuts through the landscape, serving as a key geographical element.

0001.png (split: synth; Tree, Shrub, Grass, Crop, Built-up, Barren, Water: 9,2,88,1,0,0,0)

- hybrid_gemma3-4b: The image depicts a landscape dominated by extensive grassland (88%), interspersed with scattered trees (9%) and a small area of crops (1%). A light-colored road and a network of smaller paths cut through the terrain, suggesting a pattern of human access and potentially agricultural activity within the predominantly grassy area.
- hybrid_qwen3-vl-8b: The scene is dominated by grasslands (88%), interspersed with small patches of tree cover (9%) and shrubs (2%), with sparse, fragmented crop areas (1%) suggesting a rural or semi-arid landscape with limited agricultural activity.
- text_qwen3-4b: The scene is predominantly grassland, with 88% coverage, indicating a large area of natural or managed grass cover, with minimal tree, shrub, or crop presence.
- vision_gemma3-4b: The image depicts a hilly landscape with a network of agricultural fields, primarily used for pasture or cultivation, intersected by a road and smaller dirt tracks. A meandering stream and forested areas are also visible, suggesting a varied terrain with a mix of human and natural land use
- vision_qwen3-vl-8b: This satellite image shows a patchwork of agricultural fields and forested areas, with a prominent river or stream winding through the landscape and a straight road cutting across the terrain, indicating human land use and infrastructure within a rural, possibly hilly, region.

The synthetic subset (300k images) was generated using only the train subset (80k images). The validation and test sets were never seen during synthetic data generation, so the synth subset introduces no leakage.
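
The folder layout and captions.csv columns described above suggest a simple way to pair each image with its mask and captions. The following is a minimal sketch, not the authors' loader. It assumes that captions.csv sits inside each subset folder, that it is a standard comma-separated file with a header row using the column names listed above, and that a class-percentage value looks like the string shown in the example (0,0,18,82,0,0,0). The function names `load_subset` and `parse_class_percentages` are hypothetical, and the sketch does not assume how the class percentages are actually stored in the file.

```python
import csv
from pathlib import Path

# Class order and caption columns as listed in the dataset description.
CLASS_NAMES = ["Tree", "Shrub", "Grass", "Crop", "Built-up", "Barren", "Water"]
CAPTION_COLUMNS = [
    "hybrid_gemma3-4b",
    "hybrid_qwen3-vl-8b",
    "text_qwen3-4b",
    "vision_gemma3-4b",
    "vision_qwen3-vl-8b",
]


def parse_class_percentages(text):
    """Turn a string such as "0,0,18,82,0,0,0" into {class name: percent}."""
    values = [float(v) for v in text.split(",")]
    if len(values) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} values, got {len(values)}")
    return dict(zip(CLASS_NAMES, values))


def load_subset(subset_dir):
    """Yield one record per row of <subset_dir>/captions.csv.

    Assumption: the header row contains "filename", "split", and the five
    caption column names above. Adjust the keys if the real file differs.
    """
    subset_dir = Path(subset_dir)
    with open(subset_dir / "captions.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            name = row["filename"]
            yield {
                # Image and mask share the same file name in sibling folders.
                "image": subset_dir / "images" / name,
                "mask": subset_dir / "masks" / name,
                "split": row["split"],
                "captions": [row[c] for c in CAPTION_COLUMNS],
            }
```

For example, `next(load_subset("synth"))` would return the paths and five captions for the first row of the synth subset, provided the file matches the assumed layout. The records hold paths only, so images and masks can be opened lazily with whatever imaging library is preferred.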

Code Repository

Python scripts for collecting and preparing the data, training models, and running inference are available here.

Image processing, Image Captioning, Deep learning, Semantic Segmentation, Remote sensing, Synthetic Data Generation
