ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite

AI4PROFHEALTH - Profession-health status co-occurrence graph statistics

Authors: Rodríguez-Ortega, Miguel; Marsol Torrent, Sergi; Farré-Maduell, Eulàlia; Krallinger, Martin; Lima-López, Salvador; Becerra-Tomé, Alberto; Rodríguez Miret, Jan;

Abstract

This dataset contains the Pointwise Mutual Information (PMI) values for co-occurrence pairs between different mention categories extracted from two distinct clinical datasets: MESINESP2 and the Clinical Case Reports Collection. PMI is a statistical measure of the strength of association between a pair of entities: it compares their observed co-occurrence with the frequency expected under the assumption of independence. The PMI values for each co-occurrence pair are derived from the association of professions and clinical concepts, with the aim of identifying potential occupational health risks. By sharing these datasets, we aim to support further research into the relationships between professions and clinical entities, enabling the development of more accurate and targeted occupational health risk models.

There is a separate file for each corpus, and each dataset is provided in CSV format for easy access and further analysis.

Data Structure:

- MESINESP2: mesinesp2_co-occurrence_pmi.zip
- Clinical case reports: clinical_cases_co-occurrence_pmi.zip

The repository contains one .zip file per corpus, each containing a .csv file with the co-occurrences between the detected professions and clinical entities. The file has the following columns, in order:

- span_mention_1: Mention string (original): profession
- normalized_entity_1: Controlled vocabulary entry for this term
- mention1_category: Semantic class (i.e., NER label)
- mention1_freq: Absolute frequency of this mention (entity 1)
- span_mention_2: Mention string (original): entity 2 (disease, symptom, species, etc.)
- normalized_entity_2: Controlled vocabulary entry for this term
- mention2_category: Semantic class (i.e., NER label)
- mention2_freq: Absolute frequency of this mention (entity 2)
- co-occurrence: Number of co-occurrences
- PMID: PMID value

Notes

This resource has been funded by the Spanish National Proyectos I+D+i 2020 AI4ProfHealth project PID2020-119266RA-I00 (PID2020-119266RA-I0/AEI/10.13039/501100011033).

Contact

If you have any questions or suggestions, please contact:
- Miguel Rodríguez Ortega
- Martin Krallinger

Additional resources and corpora

If you are interested, you might want to check out these corpora and resources:
- MEDDOPROF: Corpus of mentions of professions, occupations and working status, with normalization (different document collection with some overlapping documents)
- MESINESP-2: Corpus of manually indexed records with DeCS/MeSH terms, comprising scientific literature abstracts, clinical trials, and patent abstracts (different document collection)
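As an illustration of the PMI measure described in the abstract, the following minimal sketch computes PMI from raw counts using the standard definition, PMI(x, y) = log( P(x, y) / (P(x) P(y)) ). It is not the authors' code. The function name `pmi`, the `total` parameter (the number of observation units, such as documents, over which frequencies were counted), and the base-2 logarithm are all assumptions; the dataset description does not state which normalization or logarithm base was used.

```python
import math


def pmi(pair_count, freq_1, freq_2, total, base=2.0):
    """Pointwise Mutual Information from raw counts (hypothetical helper).

    pair_count: number of co-occurrences of the two entities
    freq_1, freq_2: absolute frequency of each entity
    total: number of observation units counted over (an assumption here;
           the dataset description does not say what was used)
    base: logarithm base (2 is a common choice, also an assumption)
    """
    if min(pair_count, freq_1, freq_2, total) <= 0:
        raise ValueError("all counts must be positive")
    # P(x, y) / (P(x) * P(y)) simplifies to
    # (pair_count * total) / (freq_1 * freq_2).
    ratio = (pair_count * total) / (freq_1 * freq_2)
    return math.log(ratio, base)


# Made-up counts, not taken from the dataset:
# a pair that co-occurs exactly as often as independence predicts.
independent = pmi(pair_count=1, freq_1=10, freq_2=100, total=1000)
# a pair that co-occurs ten times more often than independence predicts.
associated = pmi(pair_count=10, freq_1=20, freq_2=50, total=1000)
```

A value of 0 means the pair co-occurs exactly as often as independence would predict. Positive values indicate a stronger-than-chance association, and negative values a weaker one.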
