RISE Crash Course: "Information Extraction from Images with AI"

Slides for the RISE Crash Course "Information Extraction from Images with AI". In this two-hour course, you will learn how multimodal large language models, such as GPT-4o, Gemini 1.5, or Claude 3.5 Sonnet, can be used to extract structured information directly from images. This approach removes the intermediate step of text recognition and transcription that traditional methods (such as Transkribus) often require. Using concrete examples from ongoing research projects, the course demonstrates the practical possibilities and limitations of this technology. It also addresses the technical and methodological prerequisites for successful implementation, and reflects on data quality, the FAIRness (Findability, Accessibility, Interoperability, Reusability) of the extracted data, and the associated costs.

Related Organizations

University of Basel
Switzerland

Keywords

LLM, GPT, image, SSH, data extraction, MLLM

Impact by BIP!

Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

