
The CO.RA.PAN Web Application is the software platform that provides public, research-oriented access to the Corpus Radiofónico Panhispánico (CO.RA.PAN). It offers modules for corpus search, metadata filtering, statistical summaries, geolinguistic visualization, and user-oriented exploration of annotated transcript data. The application also provides synchronized audio–text playback for authenticated users with access to the restricted Full Corpus.

The system integrates BlackLab Server for high-performance linguistic search over the structured transcript data of the CO.RA.PAN corpus. The underlying corpus infrastructure is based on JSON-formatted transcripts produced by the project’s data pipeline, which includes ASR-based transcription, manual quality control, segmentation, and automated linguistic annotation using spaCy. The Web Application does not generate data files; it exposes them through a unified analytical and exploratory interface.

This Zenodo record provides the persistent identifier for the software component only. The corpus data (audio and JSON transcripts) are not included here and are available under restricted access in a separate Zenodo dataset due to copyright and broadcasting constraints.

CO.RA.PAN References and Related Resources

CO.RA.PAN Full Corpus (Restricted), DOI: https://doi.org/10.5281/zenodo.15360942
CO.RA.PAN Sample Corpus (Public), DOI: https://doi.org/10.5281/zenodo.15378479
CO.RA.PAN Metadata (Public), DOI: https://doi.org/10.5281/zenodo.17843469
CO.RA.PAN Web Application, DOI: https://doi.org/10.5281/zenodo.17834023

Web Application Access

Public Web App (authenticated access for restricted corpus content): https://corapan.online.uni-marburg.de
Source code and deployment documentation: https://github.com/FTacke/corapan-webapp

Project Overview

Further documentation and related digital humanities projects: https://hispanistica.online.uni-marburg.de/
If you use this software or its infrastructure, please cite it using the metadata from this file.
audio transcription, corpus linguistics, spaCy, pluricentric Spanish, web application, corpus search, Spanish linguistics, spoken language, Flask, linguistic annotation, panhispanic variation, ASR, BlackLab Server, model speaker, geolinguistics
