This PhD thesis investigates solutions to a long-standing gap in the research infrastructure field: the formal modelling and discovery of information. Research Infrastructures (RIs) are facilities, resources and services used by the research community to conduct research and promote innovation. They provide access to tools, data and algorithms, pooled in working environments called Virtual Research Environments (VREs). By their nature, RIs form a heterogeneous landscape of resource types and of diverse, often redundant, models for describing the same resource entities. Such diversity drastically reduces the ability to exploit consistently the knowledge and skills available within an infrastructure, and ultimately limits the creation of federations of RIs. The picture is further complicated by the emerging concept of Hybrid Data Infrastructures (HDIs), which combine on-premises resources with resources provisioned by cloud and/or grid computing infrastructures. This study addresses the critical aspects of this heterogeneity and provides the research infrastructure field with both the conceptual and the analytical instruments for expanding the scale at which infrastructures currently operate. The following research questions are investigated:

- Research question 1: Is it possible to formalise a uniform resource model to seamlessly federate research infrastructures?
- Research question 2: To what extent do state-of-the-art discovery algorithms need to be amended to properly handle VREs across Hybrid Data Infrastructures?
- Research question 3: To what extent can the interface of a discovery service support the new model and algorithms?

The first objective is attained by defining a resource model that is open by design. A rigorous mathematical notation presents a minimal but powerful model consisting of a few constructs.
This is an abstract model, which can be instantiated to accommodate almost every possible scenario. The second objective is addressed by designing a novel communication model based on the concept of transaction. Transactional models are not a new idea; they have been extensively described in the literature since the 1980s. However, this work identifies a misapplication of these models in the context of research infrastructures and proposes changes to support the emerging requirements of the field. Lastly, the third objective concerns the impact that the first two have on the architecture of the services devoted to distributing information in a research infrastructure. It is investigated in depth through the detailed design of a production-ready solution that demonstrates the validity and accuracy of the theoretical aspects of this thesis. The originality of the thesis also lies in its methodology: the dissertation offers an innovative analytical and methodological approach to addressing these research questions. A key enabling component of modern Hybrid Data Infrastructures, the Information System (IS), is used as the platform to validate this work. An IS is central to all the themes tackled in this thesis. It exploits the resource model, plays the role of registry inside the infrastructure, communicates intensively with all the other actors, and copes with the heterogeneity of such a context. Its ultimate goal is to provide uniform views of the resources assigned to each VRE, regardless of their origin. The study presented in this thesis was conducted within the InfraScience group, the research group that has pioneered the Enabling Scientific Data Infrastructures field over the past two decades. The group operates at the Institute of Information Science and Technologies (ISTI), an institute of the Italian National Research Council (CNR).
This environment provided a unique setting in which to develop and validate the theories illustrated in the thesis. For the validation, the abstract resource model has been instantiated in the gCube model of the D4Science HDI, while the transactional model has been translated into the design of the IS, which has been successfully implemented and adopted by the same HDI. The dissertation carries innovative theoretical, methodological and analytical value for the research infrastructure field, and especially for the emerging Hybrid Data Infrastructures. Importantly, this work has been validated in one of the most challenging contexts available. The thesis is organised as follows: Chapters 1 and 2 describe the problem statement; Chapter 3 provides a state-of-the-art overview of the problem; Chapter 4 develops the theoretical framework by describing both the abstract and the gCube resource models; Chapter 5 presents the transaction model, how it can be used, and why it focuses on the transaction technique; Chapter 6 offers the core methodological support by describing the IS that supports the abstract resource model and the transaction model; and Chapters 7 and 8 conclude with the validation of the proposed study and its contribution to the literature.