ABSTRACT Historical newspapers are increasingly accessed digitally for different purposes by both professional and lay users. These ever-growing historical collections are usually formed by utilizing Optical Character Recognition (OCR), which may introduce noise into the texts. This subsequently compromises information retrieval (IR) performance and user understanding. The effect of OCR noise on IR performance has been studied earlier by utilizing texts with artificially degraded OCR quality (see, e.g., [2, 15]), a test collection containing documents with authentic low OCR quality [12], or by gathering end-user impressions [23]. However, it remains challenging to measure how the user’s subjective perception is affected by the amount of OCR noise remaining in the documents. Recently, the National Library of Finland has set up an experimental system which allows studying this issue. The system allows presenting each underlying historical document as two alternatives: based either on the baseline OCR quality or on the new, improved OCR quality. This setup facilitates studying the effects of OCR quality changes on the user’s subjective perception of the document. Following Gäde et al. [8], we describe in this paper the research design, infrastructure, and research data utilized in a recent user experiment by Kettunen et al. [19], in which thirty-two test subjects performed simulated work tasks [4], and we discuss the prospects for reusing the experimental components of the study. So far, the system has been used in one experiment in which the subjects performed simulated tasks. However, the research design and its general model could be utilized in the future to study the effects of OCR quality in professional settings in which historians perform naturalistic phases of their research tasks.
BIIRRR 2022 Third Workshop on Building towards Information Interaction and Retrieval Resources Re-use
User Study, Resource reuse, Historical newspaper collections, Interactive Information Retrieval, OCR quality, Simulated Work Task, BIIRRR2022, Evaluation