Powered by OpenAIRE graph
Zenodo (Open Access)

Zeta & Eta: An Exploration and Evaluation of two Dispersion-based Measures of Distinctiveness

Authors: Du, Keli; Dudar, Julia; Rok, Cora; Schöch, Christof

Abstract

In corpus linguistics, numerous statistical measures have been adopted to analyze large amounts of textual data from a contrastive perspective in order to extract characteristic or “distinctive” features. While the most widely used keyness measures are based on word frequency, a growing number of research papers have recently proposed dispersion-based measures as a better solution. These, however, are not new to Computational Literary Studies (CLS): in 2007, John Burrows introduced Zeta, a statistical measure based mainly on the degree of dispersion of a feature in a text corpus. In this paper, we also introduce Eta, a new measure of distinctiveness based on the deviation of proportions suggested by Stefan Gries. By comparing Eta with Zeta, we demonstrate that both measures are able to identify relevant, interpretable distinctive words in a target corpus. Additionally, we make a first attempt to detect the key differences between these two measures by interpreting the top distinctive words.

DFG Priority Programme (Schwerpunktprogramm) SPP 2207 "Computational Literary Studies". Online: https://gepris.dfg.de/gepris/projekt/402743989, https://dfg-spp-cls.github.io/
Subproject (Teilprojekt): "Zeta und Konsorten. Distinktivitätsmaße für die Digitalen Literaturwissenschaften". Online: https://gepris.dfg.de/gepris/projekt/424211690, https://dfg-spp-cls.github.io/projects_en/2020/01/24/TP-Zeta_and_Company/, https://zeta-project.eu/de/
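The two dispersion ideas named in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: Zeta is shown in the simplified form often used in practice (share of target segments containing a word minus the share of comparison segments containing it), and deviation of proportions (DP) follows Gries's usual formulation (half the summed absolute difference between each segment's observed share of a word's occurrences and that segment's share of the corpus). The function names and the toy corpora are hypothetical. How Eta combines DP values across the target and comparison corpora is not specified in the abstract, so the sketch stops at DP itself.

```python
def document_proportion(word, segments):
    """Share of segments (token lists) that contain the word at least once."""
    if not segments:
        raise ValueError("need at least one segment")
    return sum(1 for seg in segments if word in seg) / len(segments)


def zeta(word, target_segments, comparison_segments):
    """Simplified Zeta (assumed form): document proportion in the target
    corpus minus document proportion in the comparison corpus.
    Ranges from -1 (only in comparison) to 1 (only in target)."""
    return (document_proportion(word, target_segments)
            - document_proportion(word, comparison_segments))


def deviation_of_proportions(word, segments):
    """Gries's DP (assumed form): 0.5 * sum over segments of
    |observed share of the word's occurrences - segment's share of all tokens|.
    0 means perfectly even dispersion; values near 1 mean very uneven."""
    total_tokens = sum(len(seg) for seg in segments)
    counts = [seg.count(word) for seg in segments]
    total_hits = sum(counts)
    if total_tokens == 0 or total_hits == 0:
        raise ValueError("word must occur at least once in a non-empty corpus")
    return 0.5 * sum(
        abs(count / total_hits - len(seg) / total_tokens)
        for count, seg in zip(counts, segments)
    )


# Toy corpora (hypothetical): each segment is a list of tokens.
target = [["the", "sea", "was", "calm"], ["the", "sea", "rose", "high"]]
comparison = [["the", "city", "was", "loud"], ["a", "sea", "of", "people"]]

zeta_sea = zeta("sea", target, comparison)
dp_sea_target = deviation_of_proportions("sea", target)
dp_sea_comparison = deviation_of_proportions("sea", comparison)
```

In this toy case "sea" appears in every target segment but only half of the comparison segments, so its Zeta score is positive, and its DP is lower (more even) in the target corpus than in the comparison corpus.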

Keywords

measure of distinctiveness, dispersion, Zeta, Eta, Computational Literary Studies, SPP 2207

References (first 10 of 23)

[1] S. Klimek, R. Müller, Vergleich als Methode? Zur Empirisierung eines philologischen Verfahrens im Zeitalter der Digital Humanities [Abstract], JLT Articles 9(1) (2015). URL: http://www.jltonline.de/index.php/articles/article/view/758.

[2] P. Rayson, G. N. Leech, M. Hodges, Social differentiation in the use of English vocabulary: some analyses of the conversational component of the British National Corpus, International Journal of Corpus Linguistics 2 (1997) 133-152. ISSN: 1384-6655. Publisher: John Benjamins.

[3] M. P. Oakes, M. Farrow, Use of the Chi-Squared Test to Examine Vocabulary Differences in English Language Corpora Representing Seven Different Countries, Literary and Linguistic Computing 22 (2007) 85-99. URL: https://academic.oup.com/dsh/article/22/1/85/1025876. doi:10.1093/llc/fql044. Publisher: Oxford Academic.

[4] M. L. Newman, C. J. Groom, L. D. Handelman, J. W. Pennebaker, Gender differences in language use: An analysis of 14,000 text samples, Discourse Processes 45 (2008) 211-236. ISSN: 0163-853X. Publisher: Taylor & Francis.

[5] J. Schröter, K. Du, J. Dudar, C. Rok, C. Schöch, From Keyness to Distinctiveness - Triangulation and Evaluation in Computational Literary Studies, Journal of Literary Theory (JLT) (accepted).

[6] M. Scott, PC Analysis of Key Words and Key Key Words, System 25 (1997) 233-245.

[7] L. Anthony, AntConc: Design and development of a freeware corpus analysis toolkit for the technical writing classroom, 2005, pp. 729-737. doi:10.1109/IPCC.2005.1494244.

[8] J. Egbert, D. Biber, Incorporating text dispersion into keyword analyses, Corpora 14 (2019) 77-104. URL: https://www.euppublishing.com/doi/abs/10.3366/cor.2019.0162. doi:10.3366/cor.2019.0162.

[9] S. T. Gries, Dispersions and adjusted frequencies in corpora, 2008. doi:10.1075/ijcl.13.4.02gri.

[10] S. Gries, A new approach to (key) keywords analysis: Using frequency, and now also dispersion, Research in Corpus Linguistics 9 (2021) 1-33. doi:10.32714/ricl.09.02.02.
