As one of the major technological concepts driving ICT development today, big data has been touted as offering new forms of analysis of research data. Its application has spread across disciplines, but some research sources and archival practices do not sit comfortably within the computational turn. This has sparked concerns that cultural heritage collections that cannot be structured, represented, or indeed digitised accordingly may be excluded and marginalised by this new paradigm. This work-in-progress paper reports on the contribution of the KPLEX project's knowledge complexity approach to understanding the relationship between big data and archival practice.
This report on Data, Knowledge Organisation, and Epistemic Impact covers the findings of WP 4 of the K-PLEX project. It focuses on data collection, production, and analysis in a broad range of scientific disciplines, on epistemologies and methodologies, and on research organisation. The cross-disciplinary research topic “emotions” was chosen to ensure comparability across disciplines and to investigate different epistemic cultures. Findings are based on a survey with 123 responses and on 15 expert interviews. Results show the heterogeneity of research approaches and the epistemic dissonances resulting from a broad variety of epistemic cultures in emotion research. Datafication, the rendering of real-world phenomena into data, inevitably reduces the complexity of the research object “emotions”. This simplification results from the limitations imposed by the epistemologies and from the biases inherent to methodological decisions. The division into various disciplines and epistemic cultures, together with the challenges of interdisciplinarity, furthers the marginalisation of complexity. Interdisciplinarity in emotion research was deemed both beneficial and demanding. While interdisciplinary research projects were seen as fruitful on a theoretical and conceptual level, developing research methodologies that yield data structures which can be aggregated into larger data sets proved challenging. Data structures are designed to meet methodological requirements, not to ensure reusability. Structural factors, such as the difficulties of research organisation in large-scale interdisciplinary research units or the lack of highly ranked journals publishing interdisciplinary results, further impede such research endeavours. Data cannot be seen independently of the context in which they were constructed and collected.
The narrower context of the research setting and of the researcher, as well as the wider contexts of the historical, political, social, cultural, and linguistic circumstances of data collection, must therefore be considered. The omission of contexts and the lack of comprehensive theoretical frameworks form considerable barriers to data aggregation and have consequences for data storage, sharing, and reuse. A multiplicity of epistemologies and methodologies leads to a plurality of data and metadata formats and to a reduced acceptance of standard formats such as the W3C standard EmotionML. In the case of data on emotions, legal restrictions and ethical issues in data sharing form further barriers. Research participants were cautious about whether Big Data open up new research possibilities. Big Data are not collected according to a specific research question or methodology and are thus antecedent to the epistemological process. This can be seen as a major difference between Big Data and research data. Moreover, Big Data are investigated in an exploratory process dominated by serendipitous findings, an approach that runs counter to scientists’ conception of a steered navigation of the research process. The report outlines concise recommendations on how these conflicting epistemologies could be combined in terms of integrative datafication standards, infrastructure, and methodologies.
Description: The Data Management Plan lists all the data that will be collected, processed, and generated within the project, and details all rules, regulations, and provisions connected to handling these data.