Downloads provided by UsageCounts
The POLIANNA dataset is a collection of legislative texts from the European Union (EU) that have been annotated based on theoretical concepts of policy design. The dataset consists of 20,577 annotated spans in 412 articles, drawn from 18 EU climate change mitigation and renewable energy laws, and can be used to develop supervised machine learning approaches for scaling policy analysis. The dataset includes a novel coding scheme for annotating text spans, and you find a description of the annotated corpus, an analysis of inter-annotator agreement, and a discussion of potential applications in the paper accompanying this dataset. The objective of this dataset to build tools that assist with manual coding of policy texts by automatically identifying relevant paragraphs. Detailed instructions and further guidance about the dataset as well as all the code used for this project can be found in the accompanying paper and on the GitHub project page. The repository also contains useful code to calculate various inter-annotator agreement measures and can be used to process text annotations generated by INCEpTION. Dataset Description We provide the dataset in 3 different formats:JSON: Each article corresponds to a folder, where the Tokens and Spans are stored in a separate JSON file. Each article-folder further contains the raw policy-text as in a text file and the metadata about the policy. This is the most human-readable format. JSONL: Same folder structure as the JSON format, but the Spans and Tokens are stored in a JSONL file, where each line is a valid JSON document. Pickle: We provide the dataset as a Python object. This is the recommended method when using our own Python framework that is provided on GitHub. For more information, check out the GitHub project page. License The POLIANNA dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. If you use the POLIANNA dataset in your research in any form, please cite the dataset. Citation Sewerin, S., Kaack, L.H., Küttel, J. et al. Towards understanding policy design through text-as-data approaches: The policy design annotations (POLIANNA) dataset. Sci Data10, 896 (2023). https://doi.org/10.1038/s41597-023-02801-z
This work was also supported by ETH Career Seed Grant SEED-24 19-2, funded by the ETH Zurich Foundation.
climate change, EU policy, policy design, text-as-data, energy policy
climate change, EU policy, policy design, text-as-data, energy policy
| citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 44 | |
| downloads | 3 |

Views provided by UsageCounts
Downloads provided by UsageCounts