
Open Government Data (OGD) has the potential to support social and economic progress. However, this potential is frustrated if the data remains unused. Although the literature suggests that the metadata quality of OGD datasets is one of the main factors affecting their use, to the best of our knowledge no quantitative study has provided evidence of this relationship. Considering about 400,000 datasets from 28 national, municipal, and international OGD portals, we have programmatically analyzed their usage, their metadata quality, and the relationship between the two. Our analysis has highlighted three main findings. First, regardless of their size, the software platform adopted, and their administrative and territorial coverage, most OGD datasets are underutilized. Second, OGD portals pay varying attention to the quality of their datasets' metadata. Third, we did not find clear evidence that dataset usage is positively correlated with better metadata publishing practices. Finally, we have considered other factors, such as dataset category and some demographic characteristics of the OGD portals, and analyzed their relationship with dataset usage, obtaining partially affirmative answers.
The dataset consists of three zipped CSV files containing the collected usage data, full metadata, and computed quality values for about 400,000 datasets belonging to the 8 national, 4 international, and 16 US municipal OGD portals considered in the study. Data were collected between 2019-12-19 and 2019-12-23.
________________________________________
Portal                 #Datasets  Platform
________________________________________
US                       261,514  CKAN
France                    39,412  Other
Colombia                   9,795  Socrata
IE                         9,598  CKAN
Slovenia                   4,892  CKAN
Poland                     1,032  Other
Latvia                       336  CKAN
Puerto Rico                  178  Socrata
New York, NY               2,771  Socrata
Baltimore, MD              2,617  Socrata
Austin, TX                 2,353  Socrata
Chicago, IL                1,368  Socrata
San Francisco, CA          1,001  Socrata
Dallas, TX                 1,001  Socrata
Los Angeles, CA              943  Socrata
Seattle, WA                  718  Socrata
Providence, RI               288  Socrata
Honolulu, HI                 244  Socrata
New Orleans, LA              215  Socrata
Buffalo, NY                  213  Socrata
Nashville, TN                172  Socrata
Boston, MA                   170  CKAN
Albuquerque, NM               60  CKAN
Albany, NY                    50  Socrata
HDX                       17,325  CKAN
EUODP                     14,058  CKAN
NASA                       9,664  Socrata
World Bank Finances        2,177  Socrata
________________________________________
The three datasets share the same table structure:
Table Fields
portalid: portal identifier
id: dataset identifier
engine: identifier of the supporting portal platform: 1 (CKAN), 2 (Socrata)
admindomain: 1 (National), 2 (US), 3 (International)
downloaddate: date of data collection
views: number of total views for the dataset
downloads: number of total downloads for the dataset
overallq: overall quality value computed by applying the methodology presented by Neumaier et al. in [1]
qvalues: JSON object containing the quality values computed for the 17 metrics presented by Neumaier et al. [1]
assessdate: date of quality assessment
metadata: the overall dataset's metadata downloaded via API from the portal, according to the supporting platform schema
[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1–2:29. doi:10.1145/2964909
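As a rough illustration of the table structure described above, the following Python sketch reads rows with those field names and decodes the engine and admindomain codes and the qvalues JSON object. It is only a sketch under assumptions: the sample rows, the comma delimiter, the date values, and the metric keys metric_a and metric_b are invented placeholders, not taken from the published files, and the helper names parse_row, load_rows, and make_sample are hypothetical.

```python
import csv
import io
import json

# Field names as listed in the dataset description.
FIELDS = ["portalid", "id", "engine", "admindomain", "downloaddate", "views",
          "downloads", "overallq", "qvalues", "assessdate", "metadata"]

# Code tables from the field description; any other code is reported as "unknown".
ENGINES = {"1": "CKAN", "2": "Socrata"}
ADMIN_DOMAINS = {"1": "National", "2": "US", "3": "International"}


def parse_row(row):
    """Convert one CSV row (all strings) into typed Python values."""
    return {
        "portalid": row["portalid"],
        "id": row["id"],
        "engine": ENGINES.get(row["engine"].strip(), "unknown"),
        "admindomain": ADMIN_DOMAINS.get(row["admindomain"].strip(), "unknown"),
        "views": int(row["views"]),
        "downloads": int(row["downloads"]),
        "overallq": float(row["overallq"]),
        "qvalues": json.loads(row["qvalues"]),  # per-metric quality values
    }


def load_rows(handle):
    """Read every row from an open CSV file-like object."""
    return [parse_row(r) for r in csv.DictReader(handle)]


def make_sample():
    """Build an in-memory CSV with two invented rows (assumption, not real data)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({
        "portalid": "portal-a", "id": "ds-1", "engine": "1", "admindomain": "1",
        "downloaddate": "2019-12-19", "views": "120", "downloads": "30",
        "overallq": "0.62",
        "qvalues": json.dumps({"metric_a": 0.7, "metric_b": 1.0}),
        "assessdate": "2019-12-20", "metadata": json.dumps({"title": "example"}),
    })
    writer.writerow({
        "portalid": "portal-b", "id": "ds-2", "engine": "2", "admindomain": "2",
        "downloaddate": "2019-12-21", "views": "15", "downloads": "0",
        "overallq": "0.48",
        "qvalues": json.dumps({"metric_a": 0.4, "metric_b": 0.5}),
        "assessdate": "2019-12-22", "metadata": json.dumps({"name": "example"}),
    })
    buf.seek(0)
    return buf


rows = load_rows(make_sample())
for r in rows:
    print(r["portalid"], r["engine"], r["admindomain"], r["views"], r["overallq"])
```

With the real files, the in-memory sample would be replaced by an opened CSV file. Because the metadata field can be large, the limit set by csv.field_size_limit may need to be raised before reading.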
The dataset was created to support the analysis presented in: Quarati, Alfonso; "Open government data: usage trends and metadata quality", Journal of Information Science, 2021, DOI:10.1177/01655515211027775
Metadata quality, FAIR compliance, Open Government Data usage