Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2023
License: CC BY
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2023
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2023
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2023
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2023
License: CC BY
Data sources: Datacite
versions View all 3 versions
addClaim

InsectSet47 & InsectSet66: Expanded datasets for automatic acoustic identification of insects (Orthoptera and Cicadidae)

Authors: Faiß, Marius;

InsectSet47 & InsectSet66: Expanded datasets for automatic acoustic identification of insects (Orthoptera and Cicadidae)

Abstract

Updated full version with training, validation and test sets. Two newly compiled datasets for training neural networks to automatically identify insect species while comparing adaptive, waveform-based frontends to conventional mel-spectrogram frontends for audio feature extraction. This work was published in PLOS Computational Biology and the machine learning implementations were published on Github. These datasets expand on the previously published InsectSet32 by including recently published collections of insect recordings by citizen scientists from around the world. Recordings from BioAcoustica, xeno-canto and iNaturalist, as well as private collections by Baudewijn Odé were downloaded and manually inspected. Files with strong noise interference or intense filtering, as well as files containing sounds of multiple species were removed to compile these datasets. The files were standardised to 44.1 kHz mono WAV files ranging in length from less than one second to several minutes. Files containing long periods without insect sounds were edited into multiple smaller files with silent periods no longer than 5 seconds. These files are marked as edits in the annotation file and should be assigned together into train/validation/test sets to prevent data leakage. The annotation files contain information for each recording, including the file name, species name and identifier, as well as the data subset they were included in for training the neural network (training, test, validation). InsectSet47 expands on InsectSet32 with recordings from xeno-canto and contains 1006 original recordings from 47 species, with at least ten files per species. The total length of InsectSet47 is 22 hours. InsectSet66 further expands on InsectSet47 by adding research-grade audio observations from iNaturalist, with a total of 1554 recordings from 66 species, a total length of over 24 hours and a minimum of ten files per species. The datasets were split into the training, validation and test sets while ensuring a roughly equal distribution of audio files and audio material for every species in all three subsets. This resulted in a 60/20/20 split (train/validation/test) by file number and a 64/19.5/16.5 split by file length.

Related Organizations
Keywords

bioacoustics, cicadidae, orthoptera, insecta, machine listening, ecology, remote monitoring

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    1
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
    OpenAIRE UsageCounts
    Usage byUsageCounts
    visibility views 119
    download downloads 87
  • 119
    views
    87
    downloads
    Powered byOpenAIRE UsageCounts
Powered by OpenAIRE graph
Found an issue? Give us feedback
visibility
download
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
views
OpenAIRE UsageCountsViews provided by UsageCounts
downloads
OpenAIRE UsageCountsDownloads provided by UsageCounts
1
Average
Average
Average
119
87
Related to Research communities