publication . Part of book or chapter of book . 2009

Creating Digital Resources from Legacy Documents: An Experience Report from the Biosystematics Domain

Guido Sautter; Klemens Böhm; Donat Agosti; Christiana Klingenberg
Open Access
  • Published: 31 Dec 2009
  • Publisher: Springer Link
Digitized legacy documents marked up with XML can be used in many ways, e.g., to generate RDF statements about the world they describe. A prerequisite for doing so is that the document markup is of sufficient quality. Since fully automated markup-generation methods cannot ensure this, manual correction and cleaning are indispensable. In this paper, we report on our experiences from a digitization and markup project for a large corpus of legacy documents from the biosystematics domain, with a focus on the use of modern tools. The markup created covers both document structure and semantic details. In contrast to previous markup projects reported on in the literature, our ...
ACM Computing Classification System: Computing Methodologies, Document and Text Processing
Free text keywords: biodiversity, semantic web, biodiversity informatics, digital literature, Markup language, World Wide Web, Semantic Web Rule Language, Information retrieval, Document Definition Markup Language, XHTML, SGML, RuleML, Collaborative Application Markup Language, XML, Computer science
Providers: ZENODO, UnpayWall, Crossref