ZENODO
Dataset . 2023
License: CC 0
Data sources: ZENODO
DRYAD
Dataset . 2023
License: CC 0
Data sources: Datacite

This Research product is the result of merged Research products in OpenAIRE.

The benefit of augmenting open data with clinical data-warehouse EHR for forecasting SARS-CoV-2 hospitalizations in Bordeaux area, France

Authors: Ferté, Thomas; Jouhet, Vianney; Griffier, Romain; Hejblum, Boris; Thiébaut, Rodolphe; Bordeaux University Hospital Covid-19 Crisis Task Force

Abstract

Objective: The aim of this study was to develop an accurate regional forecast algorithm to predict the number of hospitalized patients and to assess the benefit of Electronic Health Records (EHR) information for those predictions.

Materials and Methods: Aggregated data from public SARS-CoV-2 and weather databases and from the Bordeaux Hospital data warehouse were extracted from May 16, 2020, to January 17, 2022. The outcomes were the number of hospitalized patients in the Bordeaux Hospital at 7 and 14 days. We compared the performance of different data sources, feature engineering, and machine learning models.

Results: During the period of 88 weeks, 2561 hospitalizations due to COVID-19 were recorded at the Bordeaux Hospital. The best-performing model was an elastic-net penalized linear regression using all available data, with a median relative error at 7 and 14 days of 0.136 [0.063; 0.223] and 0.198 [0.105; 0.302] hospitalizations, respectively. EHR data from the hospital data warehouse improved the median relative error at 7 and 14 days by 10.9% and 19.8%, respectively. Graphical evaluation showed that the remaining forecast error was mainly due to delay in detecting slope shifts.

Discussion: Forecast models showed good overall performance at both 7 and 14 days, which was improved by adding data from the Bordeaux Hospital data warehouse.

Conclusions: The development of hospital data warehouses might help to obtain more specific and faster information than traditional surveillance systems, which in turn will help to improve epidemic forecasting at a larger and finer scale.
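The error metric quoted in the abstract can be illustrated with a minimal sketch. It assumes that the relative error is the absolute forecast error divided by the observed count, and that the bracketed values are the first and third quartiles; neither definition is stated in this record. The function names and the toy numbers are hypothetical and are not taken from the dataset.

```python
from statistics import median, quantiles


def relative_errors(observed, predicted):
    """Absolute relative error |predicted - observed| / observed per date.

    Assumed definition; dates with zero observed patients are skipped
    to avoid division by zero.
    """
    return [abs(p - o) / o for o, p in zip(observed, predicted) if o != 0]


def summarize(errors):
    """Return (median, Q1, Q3) of a list of relative errors."""
    q1, _, q3 = quantiles(errors, n=4, method="inclusive")
    return median(errors), q1, q3


def relative_improvement(baseline_error, augmented_error):
    """Fractional reduction in error obtained by adding a data source."""
    return (baseline_error - augmented_error) / baseline_error


# Toy values for illustration only (not from the dataset).
observed = [100, 120, 80, 50]
predicted = [110, 108, 84, 60]
med, q1, q3 = summarize(relative_errors(observed, predicted))
print(f"median relative error: {med:.3f} [{q1:.3f}; {q3:.3f}]")
```

Under the same assumptions, `relative_improvement` expresses how much a model using EHR data reduces the median relative error compared with a model that uses open data only.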

Aggregated data from 2020-05-16 to 2022-01-17 from the Bordeaux Hospital EHR. During the pandemic, the Bordeaux Hospital data warehouse was used to describe the current state of the epidemic at the hospital level on a daily basis. Those data were then used in the forecast model, including: hospitalizations, hospital and ICU admissions and discharges, ambulance service notes and emergency unit notes.

Concepts related to COVID-19 (e.g. cough, dyspnoea, covid-19) were extracted from notes using dictionary-based approaches. Dictionaries were created manually, based on manual chart review, to identify the terms used by practitioners. The number and proportion of ambulance service calls or hospitalizations in emergency units mentioning concepts related to COVID-19 were then extracted.

Because data acquisition mechanisms differ, there was a delay between the occurrence of events and the acquisition of the corresponding data. The delay was 1 day for EHR data, 5 days for department hospitalizations and RT-PCR, 4 days for weather, 2 days for variants and 4 days for vaccination. For the training and evaluation of the model, the chosen date was the date of data availability, to mimic a real-time streaming forecast.
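The dictionary-based extraction and the availability delays described above can be sketched as follows. This is only an illustration under stated assumptions: the term list contains just the three examples given in the description (the actual dictionaries are not part of this record), the matching rule and all function and key names are hypothetical, and only the delay values come from the text.

```python
import re
from datetime import date, timedelta

# Hypothetical dictionary: only the three examples named in the description.
COVID_TERMS = ["cough", "dyspnoea", "covid-19"]

# Delays in days between an event and its availability, as stated above.
# The key names are illustrative.
AVAILABILITY_DELAY_DAYS = {
    "ehr": 1,
    "department_hospitalizations": 5,
    "rt_pcr": 5,
    "weather": 4,
    "variants": 2,
    "vaccination": 4,
}


def mentions_concept(note, terms=COVID_TERMS):
    """True if the note contains any dictionary term as a whole token."""
    text = note.lower()
    return any(
        re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", text)
        for term in terms
    )


def count_and_proportion(notes):
    """Number and proportion of notes mentioning a COVID-19 concept."""
    n = sum(mentions_concept(note) for note in notes)
    return n, (n / len(notes) if notes else 0.0)


def availability_date(event_date, source):
    """Date on which data for an event becomes usable by the model."""
    return event_date + timedelta(days=AVAILABILITY_DELAY_DAYS[source])


# Toy notes for illustration only (not from the dataset).
notes = ["Dry cough and fever", "Fall, ankle pain", "Suspected COVID-19, dyspnoea"]
print(count_and_proportion(notes))
print(availability_date(date(2021, 1, 1), "rt_pcr"))
```

Indexing each feature by its availability date rather than its event date keeps the model from seeing information that would not yet have been acquired on the day a forecast is made.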

The data are stored in an .rdata file, which can be opened with R (https://www.r-project.org/).

Keywords

Data Warehouse, Machine learning, Electronic health records, SARS-CoV-2, Forecasting, FOS: Health sciences
