Powered by OpenAIRE graph
ZENODO
Dataset . 2025
License: CC BY SA
Data sources: ZENODO
ZENODO
Dataset . 2025
License: CC BY SA
Data sources: Datacite

dlexDB – annotated lexical data

Authors: Kliegl, Reinhold; Hanneforth, Thomas; Geyken, Alexander; Würzner, Kay-Michael; Heister, Julian; Pohl, Edmund; Bubenzer, Johannes; +1 Authors

Abstract

This dataset contains the lexical data collected and annotated in the dlexDB project. The project was funded by the German Research Foundation (DFG) under grant number 206617755 (KL 955/12-1 and KL 955/19-1) to create a comprehensive lexical database for psychological and linguistic research. It was a collaboration between the chairs of General Psychology I and Theoretical Computational Linguistics at the University of Potsdam and the project Digitales Wörterbuch der deutschen Sprache (DWDS) at the Berlin-Brandenburg Academy of Sciences and Humanities. The database is based on the core corpus of the German language of the 20th century, compiled by the DWDS project.

The dataset includes statistical measures and frequency information on types, lemmas, syllables, and characters. The data are organized in several TSV (tab-separated values) files, each corresponding to one table. Comprehensive documentation in Markdown format explains the tables and their individual columns.

The dataset offers:

  • Frequency-based norms for numerous word properties relevant to processing
  • Statistical measures for word forms and lemmas
  • Frequencies of syllables, morphemes, and character sequences
  • Measures of word similarity

Originally designed as an online database, the dataset is now archived on Zenodo to ensure long-term access and reusability. The project bibliography is available on Zotero. An archived version of the project website www.dlexdb.de can be found on the Internet Archive. A current frequency list based on data from the core corpus of the German language of the 20th century is provided by DWDS at https://www.dwds.de/r/lexdb#kern.
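Because each table ships as a TSV file with one row per entry, a table can be read with standard tooling. The following is a minimal sketch in Python, not part of the dataset itself. The column names `type` and `freq_abs`, the sample rows, and the helper names `read_tsv` and `top_by_frequency` are hypothetical and used only for illustration; the actual table and column names are described in the Markdown documentation that accompanies the dataset. The sketch also assumes each file has a header row.

```python
import csv
import io


def read_tsv(handle):
    """Parse a tab-separated table with a header row into a list of dicts."""
    # QUOTE_NONE keeps quotation marks as literal characters, which matters
    # for lexical data where a quote can itself be an entry.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def top_by_frequency(rows, freq_column, n=3):
    """Return the n rows with the highest value in a numeric column."""
    return sorted(rows, key=lambda r: float(r[freq_column]), reverse=True)[:n]


# Made-up sample standing in for one table (hypothetical columns and values).
SAMPLE = "type\tfreq_abs\nund\t300\nHaus\t12\nder\t500\n"

rows = read_tsv(io.StringIO(SAMPLE))
top = top_by_frequency(rows, "freq_abs", n=2)
print([r["type"] for r in top])
```

For a real file, the in-memory sample would be replaced by a file handle, for example `open(path, encoding="utf-8", newline="")`, where UTF-8 is an assumption that should be checked against the documentation.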

Keywords

Psycholinguistics, FOS: Languages and literature, Linguistics, corpora

  • Impact by BIP!
    selected citations: 0
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.