ZENODO
Other ORP type . 2025
License: CC BY
Data sources: ZENODO

Mangalam Corpus of Buddhist Sanskrit Literature

Abstract

This is a corpus of Buddhist Sanskrit literature developed for the study of Buddhist Sanskrit lexicology. It comprises:

  • 368 lemmatized and metadata-enriched Buddhist Sanskrit texts, for a total of ~7 million words
  • a tokenised reference corpus of general Sanskrit, including 267 texts for a total of ~13 million words
  • a metadata table with information about each text in the Buddhist and reference corpora
  • a stemmed and normalised version of the Buddhist corpus and a sketch grammar for use in Sketch Engine

For questions and feedback, please contact Ligeia Lugli, project director: ligeia.lugli@kcl.ac.uk

Lemmatization notes

The corpora are in romanised Sanskrit (UTF-8 encoding). Where multiple spelling variants involving nasals are attested, we have normalised the spelling to ṃ. Verbs are lemmatized to the stem of the third person singular present indicative; the verb root can be found in the Root column. We have replaced avagraha with a.

Data Quality & Limitations

We are grateful to have received an Ashoka grant from the Khyentse Foundation to proofread samples of the Buddhist Sanskrit Corpus. Still, only 1.6% of the corpus has been proofread, and many segmentation and lemmatization errors are likely to remain. Quantitative evaluation based on ~9000 proofread sentences puts pre-processing accuracy at ~90% (0.912 accuracy and 0.899 F1 score).

Acknowledgments

The corpus was first realised as part of the project 'Lexis and Tradition: variation in the vocabulary of Sanskrit Mahāyāna literature'. This project was funded by the British Academy through a Newton International Fellowship (NF161436) and hosted at the Department of Theology and Religious Studies at King's College London under the supervision of Prof. Henrietta Kate Crosby. It has subsequently been expanded and its accuracy improved with funding from the Khyentse Foundation (Ashoka Grant 2021-2022).

Dr. Bruno Galasek-Hul has contributed to versions 1.4 - 1.7 thanks to funding from the Mangalam Research Center for Buddhist Languages. Dr. Anuja Ajotikar, Madhusan Rimal & Jai Paranjape have proofread sentences sampled from versions 1.7 to 2.0, thanks to funding from the Khyentse Foundation. The reference corpus of general Sanskrit has been tokenised by Matej Martinc within the project 'Computing the Dharma', funded by the National Endowment for the Humanities (HAA-277246-21).

Thanks to GRETIL, CTS e-texts, Vinita Tseng, Jowita Kramer and Prof. Steinkellner for kindly giving their permission to include automatically processed versions of some of their editions in this corpus.

Changelog

  • version 2.0 changes the title of the corpus, adds more Buddhist texts and improves pre-processing accuracy
  • version 1.9 adds more Buddhist texts, has better segmentation and lemmatization and is partially proofread
  • version 1.8 adds more Buddhist texts and is partially proofread
  • version 1.7 adds more Buddhist texts and a new pre-processed corpus of general Sanskrit
  • version 1.6 adds more Buddhist texts, improves segmentation and adds an initial iteration of the lemmatised corpus
  • version 1.5 adds more Buddhist texts, removes the reference corpus and improves segmentation
  • version 1.4 adds 59 Buddhist texts and fixes some recurrent segmentation errors
  • version 1.4.1 corrects some spacing and sentence parsing errors
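
Users who want to compare their own romanised text against the corpus may need to apply similar spelling conventions first. The following is a minimal Python sketch of the two conventions stated in the lemmatization notes (nasal variants written as ṃ, avagraha replaced with a). It is not the project's actual pre-processing code. The function name `normalise`, the assumption that avagraha is written as an apostrophe, and the assumption that "nasal variants" means a class nasal standing directly before a stop of its own class are all illustrative guesses, and the rules really used for the corpus may differ.

```python
import re
import unicodedata

# Assumption: avagraha appears as a straight or curly apostrophe in the input.
AVAGRAHA_MARKS = ("'", "\u2019")

# Assumption: a class nasal directly before a stop of the same class
# (e.g. "ṅ" before "k"/"g") is treated as a spelling variant of anusvāra.
NASAL_BEFORE_STOP = re.compile(
    r"ṅ(?=[kg])|ñ(?=[cj])|ṇ(?=[ṭḍ])|n(?=[td])|m(?=[pb])"
)


def normalise(text: str) -> str:
    """Apply the two illustrative spelling conventions to romanised Sanskrit."""
    # Compose diacritics so that each letter is a single code point.
    text = unicodedata.normalize("NFC", text)
    # Replace avagraha with "a".
    for mark in AVAGRAHA_MARKS:
        text = text.replace(mark, "a")
    # Rewrite class nasals before homorganic stops as "ṃ".
    return NASAL_BEFORE_STOP.sub("ṃ", text)


if __name__ == "__main__":
    for word in ("saṅgha", "pañca", "so 'pi", "buddha"):
        print(word, "->", normalise(word))
```

Under these assumptions, nasals before vowels or at the end of a word are left untouched, so only the narrow variant spellings are rewritten.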

This corpus is being proofread thanks to funding from the Khyentse Foundation

Keywords

Sanskrit, Buddhist Sanskrit, corpus
