Data from: Investigating human repeatability of a computer vision based task to identify meristems on a potato plant (Solanum tuberosum)

Harris, Edwin W.; Wager, Georgina A.; Butler, Matt; Mhango, Joseph M.; Monaghan, James M.; Green, Richard

ZENODO

Other ORP type, 2022

License: CC BY

Data sources: Datacite, ZENODO

Other research product: Other ORP type, 17 Feb 2022. Publisher: Zenodo

Authors: Harris, Edwin W.; Wager, Georgina A.; Butler, Matt; Mhango, Joseph M.; Monaghan, James M.; Green, Richard

doi: 10.5281/zenodo.5799424, 10.5281/zenodo.5799425

Abstract

Contents

- "SURNAME-stem-repeatability.zip": contains a folder with 30 images (10 unique) and the bounding box data capture program "desktop_image_labeller.py"
- "Tuber-stem-repeatability-instructions.docx": instructions for observers taking part in the computer experiment
- "further-information-distance-between-centres.docx": details on the distance between centres measurement
- "boxes.xlsx": dataset explaining the information for each individual bounding box
- "c_dist.xlsx": dataset for the distance between the centres
- "stems.xlsx": dataset providing information on the number of stems identified per image
- "repeatability-images-cheat-sheet.xlsx": dataset providing a key to the unique images that have been replicated three times
- "DRYAD-README.docx": use notes for repeatability data capture and associated documents

Funding provided by: UK Research and Innovation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100014013
Award Number: 48752

Labelled images for the repeatability study were produced by multiple observers, each identifying bounding boxes of the apical meristems on potato plants from images. Repeatability of bounding box identification was assessed by two separate methods: 'live labelling' (an expert was present and indicated the centre of each meristem) and 'computer labelling' (the observer identified the bounding boxes without an expert supervising). Each of the n = 10 unique images was labelled three times (giving n = 30 bounding box sets per observer). In this experiment, ten observers completed the computer labelling task, and three observers also completed the live labelling task. Bounding box coordinates were captured via a graphical user interface program, adapted from the popular program Yolo_mark (https://github.com/weharris/yolo_mark_utility).
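As an illustration of how two labellings of the same meristem might be compared, the following is a minimal sketch of a distance between bounding box centres. It assumes boxes are given as pixel corner coordinates and that the distance is Euclidean; the exact definition used for "c_dist.xlsx" is given in "further-information-distance-between-centres.docx" and may differ. The names `Box`, `centre` and `centre_distance` are hypothetical and do not come from the dataset or from "desktop_image_labeller.py".

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical bounding box in pixel corner coordinates (an assumption)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def centre(box: Box) -> Tuple[float, float]:
    """Return the (x, y) midpoint of a bounding box."""
    return ((box.x_min + box.x_max) / 2.0, (box.y_min + box.y_max) / 2.0)


def centre_distance(a: Box, b: Box) -> float:
    """Euclidean distance between the centres of two boxes (assumed metric)."""
    (ax, ay), (bx, by) = centre(a), centre(b)
    return math.hypot(ax - bx, ay - by)


# Two labellings of the same meristem from different replicates (made-up values).
first = Box(0, 0, 10, 10)
second = Box(3, 4, 13, 14)
print(centre_distance(first, second))
```

A smaller distance between the centres of repeated boxes on the same meristem would indicate closer agreement between replicates under this assumed metric.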

Labelled training data in artificial intelligence (AI) is used to teach so-called 'supervised learning models'. However, such data may contain errors or bias, which can reduce model prediction accuracy. Thus, obtaining accurate training data is of high importance. In applications of AI, such as classification and detection problems, raw training data is not always made available in published research. Likewise, the process of obtaining labelled data is not always documented well enough to enable reproducibility. This training data set captures a repeatability exercise in AI training data collection for a task that is difficult for humans to perform: delineating a bounding box around a growing apical meristem in a two-dimensional image of a potato plant.

Related Organizations

Harper Adams University
United Kingdom

Impact by BIP!

- selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Usage by UsageCounts

- views: 4
- downloads: 9
