Conference object . 2025
License: CC BY
Data sources: ZENODO

Assessing the suitability of forensic authorship analysis methodologies for speech data

Authors: Tompkinson, James; Nini, Andrea

Abstract

Developing new analytical methods and frameworks that could be integrated into forensic speaker comparison (FSC) work is a core focus of research in forensic speech science. In this paper, we explore the applicability to speech data of methods that have been used in forensic authorship analysis (FAA). Our work addresses two questions: 1) whether methods borrowed from authorship analysis can be used to analyse discrete phonetic variables within a likelihood-ratio-based framework, and 2) whether auditory phonetic analysis can be combined with “higher order” features (Gold and French 2011) such as lexis, grammar and morphology, which are frequently considered in FAA tasks, for speaker comparison. Our work builds on research by Sergidou et al. (2023), who showed that frequent words have some speaker-discriminatory power and argued that this could be useful in FSC casework. We extend this work to examine how phonetic variation can be incorporated into such a framework. We analysed transcribed speech data from a random sample of 30 speakers from the West Yorkshire Regional English Database (Gold 2020) across two different speaking styles (Task 1 and Task 2), using two well-known authorship analysis methods which incorporate the likelihood ratio (LR) framework: Cosine Delta (Ishihara 2021) and Phi n-gram tracing (Nini 2023). We applied these methods to transcripts which had been adapted to represent a range of phonetic features (vocalised hesitation markers, syllable-initial realisations of /θ/, intervocalic word-medial /t/, syllable-initial /l/ and realisations of the -ing suffix) to assess 1) whether algorithms used in FAA are similarly effective on phonetic feature sets of this kind and 2) whether combining “higher-order” linguistic features with segmental phonetic analysis would achieve greater speaker-discriminatory power.
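As a rough illustration of the kind of score that Cosine Delta produces, the sketch below z-scores the relative frequencies of a chosen feature set across a group of transcripts and takes the cosine distance between the resulting profiles. It is a minimal sketch of the general technique as commonly described in the stylometry literature, not the implementation used in this study: the function names, the toy feature list and the variance tolerance are all assumptions made for illustration, and the conversion of scores into calibrated likelihood ratios is not shown.

```python
import math
from collections import Counter


def relative_frequencies(tokens, features):
    """Relative frequency of each feature token in one transcript."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[f] / total for f in features]


def z_scores(profiles, tol=1e-12):
    """Standardise each feature (column) across all transcripts.

    Features whose standard deviation is below `tol` (an assumed
    tolerance) carry no information and are set to 0.
    """
    n = len(profiles)
    columns = list(zip(*profiles))
    means = [sum(col) / n for col in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / s if s > tol else 0.0 for x, m, s in zip(row, means, sds)]
        for row in profiles
    ]


def cosine_distance(u, v):
    """1 minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def cosine_delta(samples, features):
    """Pairwise Cosine Delta distances between tokenised transcripts."""
    profiles = [relative_frequencies(tokens, features) for tokens in samples]
    z = z_scores(profiles)
    n = len(samples)
    return [[cosine_distance(z[i], z[j]) for j in range(n)] for i in range(n)]


# Toy transcripts; the tokens stand in for words or coded phonetic variants.
toy_samples = [
    "um the um the uh".split(),
    "um um the um the".split(),
    "uh the uh uh the".split(),
]
toy_features = ["um", "uh", "the"]
distances = cosine_delta(toy_samples, toy_features)
```

In a likelihood-ratio setting, distances of this kind would be computed for many same-speaker and different-speaker pairs and then passed to a separate calibration step.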
Our findings support previous research suggesting that methods used to discriminate between authors can be usefully applied to transcribed speech data. We find that Cosine Delta and n-gram tracing are both effective in performing speaker comparison on transcribed speech data. In addition, our results show that a Cosine Delta calibrated with logistic regression already offers valuable information when it uses the consonant phonetic features alone. The analytical framework for this project, in which phonetic information is embedded in transcripts and then subjected to authorship analysis techniques using the likelihood ratio paradigm, could potentially be used to evaluate auditory phonetic variables systematically within a likelihood-ratio approach even when the phonetic features are discrete.

References

Gold, E. (2020). WYRED - West Yorkshire Regional English Database 2016-2019. [data collection]. UK Data Service. SN: 854354, DOI: 10.5255/UKDA-SN-854354
Ishihara, S. (2021). Score-based likelihood ratios for linguistic text evidence with a bag-of-words model. Forensic Science International, 327, 110980.
Nini, A. (2023). A Theory of Linguistic Individuality for Authorship Analysis. Elements in Forensic Linguistics. Cambridge University Press.
Sergidou, E. K., Scheijen, N., Leegwater, J., Cambier-Langeveld, T., & Bosma, W. (2023). Frequent-words analysis for forensic speaker comparison. Speech Communication, 150, 1-8.

Country
United Kingdom