Powered by OpenAIRE graph
ZENODO
Dataset . 2023
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2023
License: CC BY
Data sources: Datacite

The Curated Courier: Digital Text Corpora from the UNESCO Courier (1948–2020)

Authors: Martin, Benjamin George; Mohammadi Norén, Fredrik; Mähler, Roger; Marklund, Andreas; Martin, Oriane Mathilde

Abstract

Founded in 1948 as the official magazine of the United Nations Educational, Scientific and Cultural Organization, The UNESCO Courier represents an extraordinary resource for research on global themes in the humanities. The complete archive of the magazine is available in PDF form through UNESCO. These files make it possible for users anywhere to read individual issues, but they do not allow for full-text searching, much less any of the computational text analysis methods that have recently made important advances in humanities research.

The Curated Courier 1.0 is a package of digital text corpora, text analysis tools, and supplementary materials that makes the complete archive of The UNESCO Courier from 1948 to 2020 machine-readable, accessible, and reusable for digital text analysis. Here on Zenodo we publish two Courier corpora.

The first corpus (curated_courier_article_corpus) consists of the texts of all articles published in the English-language edition of The UNESCO Courier between 1948 and 2020. For this corpus we have extracted and reconstructed the complete text of all articles, for example by pulling together non-contiguous pages where necessary and by removing non-article text (masthead, photo captions, letters to the editor, and so on). We have linked each article to a comprehensive curated metadata index, included in the download (document_index.csv).

The second corpus (curated_issues) compiles the complete text of all Courier issues (English-language edition), 1948–2020. To prepare this corpus we extracted text from the PDFs that UNESCO has made available, used multiple modes of OCR, and rendered each issue as a simple text file. Our test of the OCR quality finds an average error rate of 0.7%, which should be considered good quality. Working data from the process can be found in our GitHub repository "tagged Courier." The products, text analysis tools, and additional documentation are in the repository "Curated Courier."
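The abstract does not say how the OCR error rate was measured. One common measure is the character error rate: the edit distance between a hand-checked reference passage and the OCR output, divided by the length of the reference. The following is a minimal sketch of that calculation, not the project's own test procedure; the function names and the sample strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


if __name__ == "__main__":
    # Hypothetical sample: the OCR engine read the letter "O" as the digit "0".
    reference = "The UNESCO Courier"
    ocr_output = "The UNESC0 Courier"
    print(f"{character_error_rate(reference, ocr_output):.3f}")
```

Under this measure, a rate of 0.7% would correspond to roughly seven character errors per thousand characters of reference text.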
The text of The UNESCO Courier is available in Open Access under the Attribution-ShareAlike 3.0 IGO (CC-BY-SA 3.0 IGO) license, in the context of UNESCO's open access publications policy. This dataset is published under the most recent version of the same license: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0 Deed).

These datasets were developed as part of the research project "International Ideas at UNESCO: Digital Approaches to Global Conceptual History" (INIDUN), led by Benjamin G. Martin at Uppsala University and funded by a grant from the Swedish Research Council (Vetenskapsrådet), 2020–2024. For more information, see https://inidun.github.io, as well as the project repository on GitHub, which includes documentation and files related to the curating process.

Keywords

UNESCO, international organizations, international relations, history, digital humanities
