UNSWorks
Master thesis . 2019
License: CC BY NC ND
https://dx.doi.org/10.26190/un...
Data sources: Datacite

Optimising intrinsic disorder prediction for short linear motif discovery

Authors: Paulsen, Kirsti;

Abstract

Short linear motifs (SLiMs) are short protein regions, commonly only 3-10 amino acids in length, that are directly involved in protein-protein interactions. Identification of SLiMs is important for understanding fundamental processes involved in normal cellular function. SLiMs interact with their partner proteins with low affinity, which makes them difficult to identify experimentally; as a result, many computational SLiM prediction methods have been developed. Because SLiMs typically have only a few defined positions, random non-functional sequences that match a SLiM sequence pattern are ubiquitous in any proteome. The main challenge in computational SLiM prediction is to identify the true positive SLiMs (“signal”) amongst the much more abundant false positive motif matches (“noise”). To increase the signal-to-noise ratio, sequence masking techniques are applied to screen out protein regions that are unlikely to contain real SLiMs, thereby preferentially eliminating random non-functional sequence matches from the data. SLiMs are typically found in regions of intrinsic disorder; hence, a widely implemented masking strategy is to use predictors of intrinsic protein disorder to identify and remove protein regions that form stable three-dimensional structures. However, to date, there has been no systematic study on how best to predict intrinsic disorder for SLiM discovery. In this study, I investigate the relative performance of the ten disorder prediction methods implemented in the MobiDB database, along with the functional disorder predictor ANCHOR. The SLiM prediction program SLiMProb was used to predict instances of known SLiMs across the human proteome, and SLiMFinder was used to predict novel SLiM patterns. The benchmarking program SLiMBench was used to evaluate the performance of the different input masking strategies based on disorder predictors, and to identify the optimal settings for SLiM occurrence prediction and for de novo SLiM prediction. This study shows that while all disorder prediction methods improve both SLiM occurrence prediction and de novo SLiM prediction, they do so to varying degrees. Additionally, regional smoothing of disorder predictions prior to masking was found to further improve SLiM discovery. These results will be useful for guiding future SLiM discovery efforts.
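
The masking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the SLiMProb, SLiMFinder, or SLiMBench implementation: the function names, the 0.5 disorder threshold, the use of a centred moving average as the "regional smoothing", and the toy motif pattern `P.SP` are all assumptions made for illustration only.

```python
import re


def smooth_scores(scores, window=3):
    """Centred moving average over per-residue disorder scores.

    Assumption: a simple moving average stands in for "regional smoothing";
    windows are truncated at the sequence ends.
    """
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def mask_ordered(sequence, scores, threshold=0.5, mask_char="X"):
    """Replace residues predicted to be ordered (score < threshold) with mask_char."""
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    return "".join(
        aa if score >= threshold else mask_char
        for aa, score in zip(sequence, scores)
    )


def find_motif(sequence, pattern):
    """Return (start, matched_text) for each non-overlapping regex match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]


# Toy data (hypothetical): first half predicted ordered, second half disordered.
sequence = "AAPLSPAAAAPQSPAA"
scores = [0.1] * 8 + [0.9] * 8
pattern = r"P.SP"  # hypothetical motif pattern, not a curated SLiM

unmasked_hits = find_motif(sequence, pattern)
masked_hits = find_motif(mask_ordered(sequence, scores), pattern)

# Smoothing can rescue a residue with an isolated low score inside a
# disordered region, so the surrounding motif match is not lost.
dip_scores = [0.9, 0.9, 0.4, 0.9, 0.9]
masked_raw = mask_ordered("PLSPA", dip_scores)
masked_smooth = mask_ordered("PLSPA", smooth_scores(dip_scores))
```

In this toy case, the match in the region scored as ordered is removed by masking while the match in the region scored as disordered is retained, which is the intended effect on the signal-to-noise ratio. Real disorder predictors differ in score scale and recommended threshold, so both would need to be set per predictor.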

Country
Australia
Keywords

Protein sequence analysis, 610, Big data analysis
