ZENODO
Dataset . 2020
License: CC BY
Data sources: Datacite
Data sources: ZENODO

Data for "Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification" (ImRex)

Authors: Pieter Moris; Joey De Pauw; Anna Postovskaya; Sofie Gielis; Nicolas De Neuter; Wout Bittremieux; Benson Ogunjimi; +2 Authors


Abstract

Repository containing the different experiments described in the manuscript titled "Current challenges for epitope-agnostic TCR interaction prediction and a new perspective derived from image classification". Publication DOI: TBA. Originally appeared as a preprint on bioRxiv: https://doi.org/10.1101/2019.12.18.880146.

Contains:
  • Trained model files (.h5).
  • Associated train and validation datasets for each model.
  • Learning curves and evaluation metrics.
  • Log files with training and data arguments (full training scripts are available in the GitHub repository).
  • Comparisons between different models.
  • Complete raw and processed datasets (also available in the associated GitHub repository).

Please refer to the associated GitHub repository (https://github.com/pmoris/ImRex) for more information on the directory structure and contents, as well as the scripts that generated these output files.

Contents:
  • data.zip: contains raw and preprocessed datasets. READMEs in the subdirectories describe the data sources and preprocessing steps. Please refer to the associated GitHub repository for the specific scripts that generated these files. Note that the full training and test sets (i.e. containing both positive and negative examples) are stored separately for each model/CV iteration in the models archives.
  • models-main.zip: contains the trained models and evaluation metrics for the main experiments described in the bash and pbs scripts in ./src/scripts/hpc_scripts. Log files for the experiments outlined here can be found in ./src/scripts/hpc_scripts.
  • models-full.zip: contains models that were trained on the complete VDJdb dataset without cross-validation. The data were filtered to human TRB sequences, excluding 10x data, and restricted to 10-20 (CDR3) or 8-11 (epitope) amino acid residues. Negatives were generated by shuffling (i.e. sampling a negative epitope for each positive CDR3 sequence). One set of models uses downsampling to reduce the most abundant epitopes to 400 pairs each; the other set uses no downsampling. These models were also used for evaluation on the external Adaptive dataset, as outlined in ./src/scripts/evaluate/evaluate_adaptive.sh, and on the TRA subset of sequences (./src/scripts/evaluate/evaluate_tra.sh).
  • models-decoyfit.zip: contains models that were trained on true data, but evaluated on data where epitopes were replaced by decoys.
  • models-padded-epitoperatio.zip: contains a quick test of trained models (padded/interaction map) that use a different type of negative shuffling; see the docstrings in ./src/processing/negative_sampler.py for more info.
  • models-repeat-local.zip: contains a number of repeated runs from models-main, used to estimate the variability in model performance across multiple identical runs.
  • comparisons.zip: contains comparison directories, each consisting of two or more model output directories, that contrast the performance metrics of the models. These outputs were generated with the ./src/scripts/evaluate/visualize.py script, or with the one-liners in ./src/scripts/evaluate/visualise.sh, which can operate on the entire comparisons directory at once.

Note that any file paths described here are relative to the associated GitHub repository (https://github.com/pmoris/ImRex).

Overview of different experiments:

Two main architectures were compared: the interaction map (or padded) CNN and a dual input CNN based on NetTCR (nettcr).

Two different cross-validation strategies were used: a 5x repeated 5-fold CV (repeated5fold) and an epitope-grouped CV (epitope_grouped).

The different dataset subsets are labelled as follows. Check the Makefile's preprocess-vdjdb-aug-2019 command (and the underlying script ./src/scripts/preprocessing/preprocess_vdjdb.py) for a more thorough overview of the different filtering options.
  • mhci: only epitopes presented by MHC class I.
  • trb: only TRB CDR3 sequences.
  • tra: only TRA CDR3 sequences.
  • tratrb: both types of CDR3 sequences.
  • down: moderate downsampling of the most abundant epitopes to 1000 pairs.
  • down400: strong downsampling of the most abundant epitopes to 400 pairs.
  • decoy: decoy epitope data.
  • reg001: regularization factor 0.01 (fixed value, only for padded/interaction map models).

Two different methods of generating negative TCR-epitope pairs were used: shuffling of positive pairs, i.e. sampling a single epitope from the positive pairs for each CDR3 sequence (shuffle), and sampling CDR3s from a reference repertoire (negref).

The batch size is labelled as, for example, b32 for a batch size of 32. The learning rate was either 0.0001 (lre4) or 0.001 (lre3).
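As a reading aid for the labels listed above, the following Python sketch decodes them from a run name. It is not part of the ImRex code base: the function name `decode_labels`, the example run names and the assumption that labels can be found by simple substring matching are all hypothetical, and the actual directory names in the archives may be organised differently.

```python
import re


def decode_labels(name: str) -> dict:
    """Decode experiment labels from a run name (hypothetical helper)."""
    info = {}

    # CDR3 chain subset; check the longest label first, because "tratrb"
    # contains both "tra" and "trb".
    if "tratrb" in name:
        info["chains"] = "TRA+TRB"
    elif "trb" in name:
        info["chains"] = "TRB"
    elif "tra" in name:
        info["chains"] = "TRA"

    info["mhci_only"] = "mhci" in name
    info["decoy"] = "decoy" in name

    # "down400" caps abundant epitopes at 400 pairs, plain "down" at 1000.
    if "down400" in name:
        info["downsample_cap"] = 400
    elif "down" in name:
        info["downsample_cap"] = 1000
    else:
        info["downsample_cap"] = None

    for label in ("shuffle", "negref"):
        if label in name:
            info["negatives"] = label
    for label in ("padded", "nettcr"):
        if label in name:
            info["architecture"] = label
    for label in ("repeated5fold", "epitope_grouped"):
        if label in name:
            info["cv"] = label

    # b32 -> batch size 32; the lookbehind skips a "b" that ends a word.
    batch = re.search(r"(?<![a-z])b(\d+)", name)
    if batch:
        info["batch_size"] = int(batch.group(1))

    # lre4 -> 1e-4, lre3 -> 1e-3.
    lr = re.search(r"lre(\d+)", name)
    if lr:
        info["learning_rate"] = float(f"1e-{lr.group(1)}")

    # reg001 -> 0.01 (assumes the decimal point follows the first digit).
    reg = re.search(r"reg(\d+)", name)
    if reg:
        digits = reg.group(1)
        info["regularization"] = float(f"{digits[0]}.{digits[1:]}")

    return info


if __name__ == "__main__":
    # Made-up run name, used only to illustrate the label scheme.
    print(decode_labels("trbmhcidown-shuffle-padded-b32-lre4-reg001"))
```

Substring matching keeps the sketch short, but it would misfire on names that contain a label by coincidence, so the log files remain the authoritative record of the training and data arguments.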
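The shuffling strategy for negatives described above can be illustrated with a minimal Python sketch. This is not the implementation in ./src/processing/negative_sampler.py: the function name `shuffle_negatives`, the retry limit and the rule that a sampled pair must not already be a known positive are assumptions made for the illustration.

```python
import random


def shuffle_negatives(positives, seed=0, max_tries=100):
    """Pair each positive CDR3 with one epitope sampled from the positives.

    positives: list of (cdr3, epitope) tuples.
    Returns a list of (cdr3, epitope) tuples that are not known positives.
    """
    rng = random.Random(seed)
    known = set(positives)
    # Sampling from the full epitope column keeps the epitope frequencies
    # of the positive data.
    epitopes = [epitope for _, epitope in positives]

    negatives = []
    for cdr3, _ in positives:
        for _attempt in range(max_tries):
            candidate = rng.choice(epitopes)
            # Assumption: reject pairs that occur among the positives.
            if (cdr3, candidate) not in known:
                negatives.append((cdr3, candidate))
                break
        # If no valid epitope is found within max_tries, the CDR3 is skipped.
    return negatives


if __name__ == "__main__":
    # Placeholder sequences, used only to show the input format.
    toy = [("CASSAAAF", "EPITOPEA"), ("CASSBBBF", "EPITOPEB"), ("CASSCCCF", "EPITOPEC")]
    for pair in shuffle_negatives(toy):
        print(pair)
```

Because epitopes are drawn in proportion to their abundance among the positives, highly abundant epitopes also dominate the negatives, which is one reason the downsampled subsets (down, down400) are relevant to this sampling scheme.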

Country
Belgium
Keywords

Chemistry, epitope, TCR-epitope interaction, convolutional neural network, T-cell receptor, Biology, immunoinformatics, Mathematics, molecular interaction prediction
