
# Cape Hatteras Landsat8 RGB Images and Labels for Image Segmentation using the program, Segmentation Gym

## Overview

* Test datasets and files for testing the [segmentation gym](https://github.com/Doodleverse/segmentation_gym) program for image segmentation
* Dataset made by Daniel Buscombe, Marda Science LLC.
* The dataset consists of a time-series of Landsat-8 images of Cape Hatteras National Seashore, courtesy of the U.S. Geological Survey.
* Imagery spans the period February 2015 to September 2021.
* Labels were created by Daniel Buscombe, Marda Science, using the labeling program [Doodler](https://github.com/Doodleverse/dash_doodler).

Download this file and unzip it somewhere on your machine (although *not* inside the `segmentation_gym` folder), then see the relevant page on the [segmentation gym wiki](https://github.com/Doodleverse/segmentation_gym/wiki) for further explanation.

## file structure

```{sh}
/Users/Someone/my_segmentation_zoo_datasets
│   ├── config
│   |   └── *.json
│   ├── capehatteras_data
|   |   ├── fromDoodler
|   |   |   ├──images
│   |   |   └──labels
|   |   ├──npzForModel
│   |   └──toPredict
│   └── modelOut
│       └── *.png
│   └── weights
│       └── *.h5
```

## config

There are 3 config files:

1. `/config/hatteras_l8_resunet.json`
2. `/config/hatteras_l8_vanilla_unet.json`
3. `/config/hatteras_l8_resunet_model2.json`

The first two are for res-unet and unet models, respectively. The third differs from the first only in the kernel size it specifies. It is provided as an example of how to conduct model training experiments, modifying one hyperparameter at a time in an effort to create an optimal model. They all contain the same essential information, and differ as annotated below:

```
{
  "TARGET_SIZE": [768,768],  # the size of the imagery you wish the model to train on. This may not be the original size
  "MODEL": "resunet",        # model name. Otherwise, "unet"
  "NCLASSES": 4,             # number of classes
  "KERNEL": 9,               # horizontal size of convolution kernel in pixels
  "STRIDE": 2,               # stride of the convolution kernel
  "BATCH_SIZE": 7,           # number of images/labels per batch
  "FILTERS": 6,              # number of filters
  "N_DATA_BANDS": 3,         # number of image bands
  "DROPOUT": 0.1,            # amount of dropout
  "DROPOUT_CHANGE_PER_LAYER": 0.0,     # change in dropout per layer
  "DROPOUT_TYPE": "standard",          # type of dropout. Otherwise "spatial"
  "USE_DROPOUT_ON_UPSAMPLING": false,  # if true, dropout is used on upsampling as well as downsampling
  "DO_TRAIN": false,         # if false, the model will not train; the program will use this config file and data directory to load the model weights and test the model on the validation subset. If true, the model will train from scratch (warning! this will overwrite the existing weights file in h5 format)
  "LOSS": "dice",            # model training loss function. Otherwise "cat" for categorical cross-entropy
  "PATIENCE": 10,            # number of epochs of no model improvement before training is aborted
  "MAX_EPOCHS": 100,         # maximum number of training epochs
  "VALIDATION_SPLIT": 0.6,   # proportion to use for validation
  "RAMPUP_EPOCHS": 20,       # [LR-scheduler] epochs of ramp-up to the maximum learning rate
  "SUSTAIN_EPOCHS": 0.0,     # [LR-scheduler] epochs to sustain at the maximum
  "EXP_DECAY": 0.9,          # [LR-scheduler] decay rate
  "START_LR": 1e-7,          # [LR-scheduler] starting learning rate
  "MIN_LR": 1e-7,            # [LR-scheduler] minimum learning rate
  "MAX_LR": 1e-4,            # [LR-scheduler] maximum learning rate
  "FILTER_VALUE": 0,         # if >0, the size of a median filter to apply to outputs (not recommended unless you have noisy outputs)
  "DOPLOT": true,            # make plots
  "ROOT_STRING": "hatteras_l8_aug_768",  # data file (npz) prefix string
  "USEMASK": false,          # use the convention 'mask' in label image file names, instead of the preferred 'label'
  "AUG_ROT": 5,              # [augmentation] amount of rotation in degrees
  "AUG_ZOOM": 0.05,          # [augmentation] amount of zoom as a proportion
  "AUG_WIDTHSHIFT": 0.05,    # [augmentation] amount of random width shift as a proportion
  "AUG_HEIGHTSHIFT": 0.05,   # [augmentation] amount of random height shift as a proportion
  "AUG_HFLIP": true,         # [augmentation] if true, randomly apply horizontal flips
  "AUG_VFLIP": false,        # [augmentation] if true, randomly apply vertical flips
  "AUG_LOOPS": 10,           # [augmentation] number of portions to split the data into (recommended > 2 to save memory)
  "AUG_COPIES": 5,           # [augmentation] number of augmented copies to make
  "SET_GPU": "0",            # which GPU to use. If multiple, list separated by commas, e.g. '0,1,2'. If CPU is requested, use "-1"
  "WRITE_MODELMETADATA": false,  # if true, prompts `seg_images_in_folder.py` to write detailed metadata for each sample file
  "do_crf": true             # if true, apply CRF post-processing to outputs
}
```
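Note that the `#` annotations above are explanatory only: strict JSON has no comment syntax, and the config files in `/config` are plain JSON. If you keep an annotated copy of a config for your own notes, a minimal sketch of parsing it (a hypothetical helper, not part of Segmentation Gym) might look like this:

```python
import json
import re

def load_annotated_config(path):
    """Parse a JSON config whose lines may carry '#' annotations.

    Caution: this naive regex would also strip a '#' inside a string
    value; none of the values in the config shown above contain one.
    """
    with open(path) as f:
        text = f.read()
    text = re.sub(r"#.*$", "", text, flags=re.MULTILINE)  # drop the annotations
    return json.loads(text)

cfg = load_annotated_config("config/hatteras_l8_resunet.json")
print(cfg["MODEL"], cfg["TARGET_SIZE"], cfg["NCLASSES"])
```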
## capehatteras_data

Folder containing all the model input data:

```{sh}
│   ├── capehatteras_data: folder containing all the model input data
|   |   ├── fromDoodler: folder containing images and labels exported from Doodler using [this program](https://github.com/dbuscombe-usgs/dash_doodler/blob/main/utils/gen_images_and_labels_4_zoo.py)
|   |   |   ├──images: jpg format files, one per label image
│   |   |   └──labels: jpg format files, one per image
|   |   ├──npzForModel: npz format files for model training using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/train_model.py), created following the workflow [documented here](https://github.com/dbuscombe-usgs/segmentation_zoo/wiki/Create-a-model-ready-dataset) using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/make_nd_dataset.py); see the sketch below for a quick way to inspect them
│   |   └──toPredict: a folder of images to test model prediction using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/seg_images_in_folder.py)
```
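To sanity-check a model-ready file before training, the npz archives can be opened directly with numpy. A minimal sketch follows; the array names stored per file depend on how the dataset was packaged, so this simply lists whatever is present (add `allow_pickle=True` to `np.load` if a file holds object arrays):

```python
import numpy as np
from glob import glob

# List the arrays stored in the first few model-ready npz files.
for fname in sorted(glob("capehatteras_data/npzForModel/*.npz"))[:3]:
    print(fname)
    with np.load(fname) as data:
        for key in data.files:
            print("  ", key, data[key].shape, data[key].dtype)
```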
## modelOut

PNG format files containing example model outputs from the train ('_train_' in filename) and validation ('_val_' in filename) subsets, as well as an image showing training loss and accuracy curves, with `trainhist` in the filename. There are two sets of these files: those associated with the residual unet trained with dice loss contain `resunet` in their name, and those from the UNet are named with `vanilla_unet`.

## weights

There is a model weights file (h5 format) associated with each config file.
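The h5 files hold weights only, not complete models, so using them requires first building a network matching the corresponding config (see the Segmentation Gym wiki). To peek at what a weights file contains without rebuilding anything, a sketch using h5py (the filename here is an assumption, chosen to mirror the config name):

```python
import h5py

# Walk the h5 weights file and print every stored weight array.
with h5py.File("weights/hatteras_l8_resunet.h5", "r") as f:  # assumed filename
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)
```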
Keywords: coastal landcover, land-use and land-cover, deep learning, UNet, LULC, image segmentation