software . 2018

Trove Newspaper Harvester

Sherratt, Tim
Open Access
  • Published: 01 Jan 2018
  • Publisher: figshare
Abstract
The Trove Newspaper Harvester is a command-line tool written in Python that helps you download large quantities of digitised newspaper articles from Trove.

Instead of working your way through page after page of search results using Trove’s web interface, the newspaper harvester will save the results of your search to a CSV (spreadsheet) file which you can then filter, sort, or analyse.

Even better, the harvester can save the full OCRd (and possibly corrected) text of each article to an individual file. You could, for example, collect the text of thousands of articles on a particular topic and then feed them...
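
As a rough illustration of the "filter, sort, or analyse" step mentioned in the abstract, the sketch below filters and sorts rows of a CSV file using only the Python standard library. It does not use the harvester itself. The column names (article_id, title, date) and the sample rows are hypothetical placeholders, not the harvester's actual output schema, so adjust them to match the header row of a real results file.

```python
import csv
import io

# Hypothetical sample standing in for a harvested results file. The column
# names and rows are assumptions for illustration, not the harvester's schema.
SAMPLE_CSV = """article_id,title,date
1,Floods in the north,1931-02-03
2,Cricket results,1929-11-20
3,Northern floods recede,1931-02-10
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


def filter_rows(rows, field, keyword):
    """Keep rows whose `field` contains `keyword`, ignoring case."""
    needle = keyword.lower()
    return [row for row in rows if needle in row.get(field, "").lower()]


def sort_rows(rows, field):
    """Sort rows by the string value of `field` (ISO dates sort chronologically)."""
    return sorted(rows, key=lambda row: row.get(field, ""))


rows = load_rows(io.StringIO(SAMPLE_CSV))
matches = sort_rows(filter_rows(rows, "title", "flood"), "date")
for row in matches:
    print(row["date"], row["title"])
```

To work with a real file, replace the io.StringIO wrapper with open(path, newline="", encoding="utf-8"), where path points at the CSV you want to inspect.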
Subjects
ACM Computing Classification System: ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
free text keywords: Digital Humanities, Trove, newspapers, harvesting
Communities
Digital Humanities and Cultural Heritage
Download from
figshare
Software . 2018
Provider: Datacite
figshare
Software . 2018
Provider: figshare