
This dataset provides the evaluation framework used in a systematic literature review of methods for extracting and aligning data from tables and charts in scientific publications. The framework captures structured annotations for 68 peer-reviewed studies, covering tasks, modalities (tables, charts, multimodal), benchmarks, models, architectures, evaluation metrics, and support for variable identification, alignment, and reconstruction. It is designed to support reproducibility, comparative analysis, and meta-research on multimodal document understanding, and it enables quantitative and qualitative analysis of trends, open challenges, and methodological gaps in table and chart extraction, including multimodal alignment, variable and value association, and benchmark reuse. Researchers developing extraction systems, benchmarking multimodal models, or studying the state of the art in scientific document analysis can reuse this resource, particularly for structured data extraction from tables and charts.
Document Analysis, benchmarks and datasets, Artificial Intelligence, Computer Science, Information Retrieval, chart extraction, multimodal information extraction, table extraction, scientific document analysis
