ZENODO
Dataset . 2022
License: CC BY
Data sources: Datacite, ZENODO, Sygma
https://doi.org/10.5281/zenodo...

A Greek Parliament Proceedings Dataset for Computational Linguistics and Political Analysis

Authors: Konstantina Dritsa; Kaiti Thoma; John Pavlopoulos; Panagiotis Louridas

Abstract

The dataset includes the following files:

1. tell_all_cleaned.csv: The main dataset. It contains 1,280,918 speech fragments of Greek parliament members, in the order of the conversation that took place, exported from 5,355 parliamentary sitting record files, with a total volume of 2.12 GB. The speeches extend chronologically from July 1989 to July 2020 and include the following information:
   member_name: the name of the individual who spoke during a sitting.
   sitting_date: the date the sitting took place.
   parliamentary_period: the name and/or number of the parliamentary period in which the speech took place. A parliamentary period is the time span between one general election and the next. A parliamentary period includes multiple parliamentary sessions.
   parliamentary_session: the name and/or number of the parliamentary session in which the speech took place. A session is a time span, usually of 10 months, within a parliamentary period during which the parliament can convene and function as stipulated by the constitution. A session is regular, extraordinary or special. In the intervals between sessions the parliament is in recess. A parliamentary session includes multiple parliamentary sittings.
   parliamentary_sitting: the name and/or number of the parliamentary sitting in which the speech took place. A sitting is a meeting of parliament members.
   political_party: the political party of the speaker.
   government: the government in force when the speech took place.
   member_region: the electoral district the speaker belonged to.
   roles: information about the parliamentary roles and/or government position of the speaker.
   member_gender: the gender of the speaker.
   speech: the speech that the individual gave during the parliamentary sitting.

2. wiki_data: A folder of modern Greek female and male names and surnames and their available grammatical cases, crawled from the entries of the Wiktionary Greek names category (https://en.wiktionary.org/wiki/Category:Greek_names). We produced the missing grammatical cases according to the rules of Greek grammar and saved the resulting files in the same folder, adding the string "_populated.json" to their filenames.

3. parl_members_activity_1989onwards_with_gender.csv: The Greek Parliament website provides a list of all the elected members of parliament since the fall of the military junta in Greece in 1974. We collected and cleaned the data, added the gender, and kept the members elected from 1989 onwards, matching the available parliament proceeding records. This dataset includes the full names of the members, the date range of their service, the political party they served, the electoral district they belonged to and their gender.

4. formatted_roles_gov_members_data.csv: By government members we refer to individuals in ministerial or other government posts, regardless of whether they were elected to the parliament. This information is available on the website of the Secretariat General for Legal and Parliamentary Affairs. The government members dataset includes the full names of the officials, the name of the role they were given, the date range of their service in each specific role and their gender.

5. governments_1989onwards.csv: A dataset of government information, including the names of governments since 1989, their start and end dates, and a URL that points to the official web page of each past government. The data is crawled from the website of the Secretariat General for Legal and Parliamentary Affairs.

6. extra_roles_manually_collected.csv: A dataset with information collected manually from Wikipedia about additional government or parliament posts, such as Chairman of the Parliament, party leaders, opposition leaders and other information.

7. all_members_activity.csv: A merge of all the information in files 3, 4, 5 and 6. Each row includes the full name of the individual, the start and end date of their term of office, the political party and electoral district they belonged to, their gender, the parliamentary and/or government positions they held along with start and end dates, and the name of the government that was in power during their term of office. An individual can change political parties or become an independent member of the parliament during a parliamentary period, and can therefore have more than one row in the file.

8. freqs_for_semantic_shift_cleaned_data_decade1990.csv & freqs_for_semantic_shift_cleaned_data_decade2010.csv: Word frequencies in the corpora of the decades 1990-1999 and 2010-2019.

9. compass_top100.csv: Top 100 most changed words between the decades 1990-1999 and 2010-2019, as computed with the Compass tool by Di Carlo et al. [1].

10. compass_fc_top100.csv: Top 100 most changed words between the decades 1990-1999 and 2010-2019, as computed with the Compass tool [1] in combination with the frequency cut-offs of the Gonen et al. approach [3]. The frequency cut-offs use the files in item 8.

11. procrustes_top100.csv: Top 100 most changed words between the decades 1990-1999 and 2010-2019, as computed with the Orthogonal Procrustes approach of Hamilton et al. [2].

12. nn_top100.csv: Top 100 most changed words between the decades 1990-1999 and 2010-2019, as computed with the Gonen et al. approach [3].

13. second_order_top100.csv: Top 100 most changed words between the decades 1990-1999 and 2010-2019, as computed with the Second-Order Similarity approach of Hamilton et al. [4].

14. top100_minfreq50.xls: An .xls file for convenient viewing of the top 100 most changed words per approach with a minimum frequency of 50 occurrences, produced by merging files 9, 10, 11, 12 and 13.

15. freqs_for_semantic_shift_cleaned_data_period1997_2007.csv & freqs_for_semantic_shift_cleaned_data_period2008_2018.csv: Word frequencies in the corpora of the periods before (1997-2007) and during (2008-2018) the Greek economic crisis.

16. semantic_shifts_dichotomy_crisis_compass_1997_2007_2008_2018_atleast50.csv: The top 100 most changed words between the periods before (1997-2007) and during (2008-2018) the Greek economic crisis. The computations use the Compass tool.

17. selected_topics_shift_per_period_compass.csv: The usage change of selected topics/words of generic political interest between pairs of consecutive parliamentary periods. The computations use the Compass tool.

18. semantic_shifts_party_embeddings_per_period_merged_compass.csv: The usage change of selected political party names that have played an important role in recent political history, namely New Democracy (ND), the Panhellenic Socialist Movement (PASOK), the Coalition of the Radical Left - Progressive Alliance (SYRIZA), the Communist Party of Greece (KKE), the Coalition of the Left, of Movements and Ecology (SYN) and Golden Dawn (GD).

-------------

Citations:

[1] Valerio Di Carlo, Federico Bianchi, and Matteo Palmonari. Training Temporal Word Embeddings with a Compass. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, AAAI’19, pages 6326–6334, 2019. doi: 10.1609/aaai.v33i01.33016326.

[2] William L. Hamilton, Jure Leskovec, and Dan Jurafsky. Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2016, pages 1489–1501, Berlin, Germany, August 2016. Association for Computational Linguistics. doi: 10.18653/v1/P16-1141. URL https://www.aclweb.org/anthology/P16-1141.

[3] Hila Gonen, Ganesh Jawahar, Djamé Seddah, and Yoav Goldberg. Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, pages 538–555, Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.51. URL https://aclanthology.org/2020.acl-main.51.

[4] William L. Hamilton, Jure Leskovec, and Dan Jurafsky. Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, pages 2116–2121, Austin, Texas, November 2016. Association for Computational Linguistics. doi: 10.18653/v1/D16-1229. URL https://www.aclweb.org/anthology/D16-1229.

-------------

Acknowledgments: This work was supported by the European Union’s Horizon 2020 research and innovation program "FASTEN" under grant agreement No 825328 and the non-profit data journalism organization iMEdD.org.
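Because the main file is described as 2.12 GB, it may be more practical to stream it row by row than to load it whole. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that the CSV header uses the field names listed for tell_all_cleaned.csv (here political_party and member_gender) and that the file is UTF-8 encoded; neither is confirmed by the description. The helper names read_speeches and count_fragments, and the sample rows, are hypothetical.

```python
import csv
import io
from collections import Counter


def read_speeches(handle):
    """Return an iterator of row dicts read lazily from an open text handle."""
    # Speech fields can be long; raise the per-field limit above the csv default.
    csv.field_size_limit(10**9)
    return csv.DictReader(handle)


def count_fragments(rows, keys=("political_party", "member_gender")):
    """Count speech fragments per combination of the given columns."""
    counts = Counter()
    for row in rows:
        counts[tuple(row[k] for k in keys)] += 1
    return counts


# Tiny in-memory stand-in for the real file (hypothetical values,
# column names assumed to match the documented field names).
sample = io.StringIO(
    "member_name,political_party,member_gender,speech\n"
    "speaker a,party x,female,first fragment\n"
    "speaker b,party y,male,second fragment\n"
    "speaker a,party x,female,third fragment\n"
)
counts = count_fragments(read_speeches(sample))
for key, n in sorted(counts.items()):
    print(key, n)
```

For the real file, the same functions could be applied to a handle opened with open("tell_all_cleaned.csv", newline="", encoding="utf-8"), again assuming that encoding. Memory use then depends on the number of distinct key combinations, not on the file size.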
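To give an intuition for the kind of score behind nn_top100.csv, here is a simplified sketch of a nearest-neighbor overlap measure in the spirit of Gonen et al. [3]: a word counts as more changed when its nearest neighbors in the two corpora overlap less. This is not the authors' pipeline. The toy vectors, the neighborhood size k, the function names, and the way the minimum frequency of 50 is applied to both corpora are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k_neighbors(word, space, vocab, k):
    """Return the k words in vocab most similar to `word` within one space."""
    others = [w for w in vocab if w != word]
    others.sort(key=lambda w: cosine(space[word], space[w]), reverse=True)
    return set(others[:k])


def neighbor_overlap_ranking(space_a, space_b, freq_a, freq_b, k=1, min_freq=50):
    """Rank shared words from most to least changed by neighbor overlap."""
    # Keep only words present in both spaces and frequent enough in both corpora.
    vocab = sorted(
        w for w in space_a
        if w in space_b
        and freq_a.get(w, 0) >= min_freq
        and freq_b.get(w, 0) >= min_freq
    )
    scores = {
        w: len(top_k_neighbors(w, space_a, vocab, k)
               & top_k_neighbors(w, space_b, vocab, k))
        for w in vocab
    }
    # A smaller overlap means a larger usage change.
    return sorted(vocab, key=lambda w: scores[w]), scores


# Hypothetical 2-dimensional embeddings for two time periods.
space_a = {"market": (1.0, 0.0), "trade": (0.9, 0.1), "crisis": (0.0, 1.0),
           "storm": (0.1, 0.9), "rare": (0.5, 0.5)}
space_b = {"market": (1.0, 0.0), "trade": (0.9, 0.1), "crisis": (0.8, 0.3),
           "storm": (0.1, 0.9), "rare": (0.5, 0.5)}
freq_a = {"market": 120, "trade": 90, "crisis": 60, "storm": 75, "rare": 3}
freq_b = {"market": 150, "trade": 80, "crisis": 400, "storm": 70, "rare": 5}

ranking, scores = neighbor_overlap_ranking(space_a, space_b, freq_a, freq_b)
print(ranking[0], scores)
```

In this toy data "crisis" moves from the neighborhood of "storm" to that of "trade", so its neighbor sets no longer overlap and it ranks first, while "rare" is dropped by the frequency filter. Real embeddings would use a much larger k and vocabulary, where this brute-force neighbor search would be too slow.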
