Phylogenetic trees include errors for a variety of reasons. We argue that one way to detect errors is to build a phylogeny with all the data and then detect taxa that artificially inflate the tree diameter. We formulate an optimization problem that seeks to find k leaves that can be removed to reduce the tree diameter maximally. We present a polynomial time solution to this “k-shrink” problem. Given this solution, we then use non-parametric statistics to find an outlier set of taxa that have an unexpectedly high impact on the tree diameter. We test our method, TreeShrink, on five biological datasets, and show that it is more conservative than rogue taxon removal using RogueNaRok. When the amount of filtering is controlled, TreeShrink outperforms RogueNaRok in three out of the five datasets, and they tie in another dataset.
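To make the k-shrink problem statement concrete, here is a minimal Python sketch that solves it by exhaustive search on a toy tree. It is not the polynomial-time algorithm described in the paper and is not TreeShrink's code; it only illustrates the objective. The function names (`leaf_distances`, `diameter`, `k_shrink_bruteforce`), the edge-list input format, and the toy tree are all assumptions made for this illustration. It relies on the fact that removing a leaf does not change the path lengths between the remaining leaves, so the diameter of the reduced tree is the largest remaining leaf-to-leaf distance.

```python
from collections import defaultdict
from itertools import combinations


def leaf_distances(edges):
    """Return (sorted leaves, pairwise leaf-to-leaf path lengths) of a weighted tree.

    `edges` is a list of (node, node, branch_length) tuples (hypothetical format).
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    leaves = sorted(n for n in adj if len(adj[n]) == 1)
    dist = {}
    for src in leaves:
        # Iterative depth-first walk from each leaf; quadratic, fine for a toy tree.
        stack = [(src, None, 0.0)]
        while stack:
            node, parent, d = stack.pop()
            if node != src and len(adj[node]) == 1:
                dist[(src, node)] = d
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, d + w))
    return leaves, dist


def diameter(leaves, dist):
    """Largest leaf-to-leaf distance among `leaves` (0.0 if fewer than two)."""
    return max((dist[(a, b)] for a, b in combinations(leaves, 2)), default=0.0)


def k_shrink_bruteforce(edges, k):
    """Try every set of k leaves and return (removed set, resulting diameter).

    Exponential in k; an illustration of the objective, not the paper's method.
    """
    leaves, dist = leaf_distances(edges)
    best = None
    for removed in combinations(leaves, k):
        kept = [x for x in leaves if x not in removed]
        d = diameter(kept, dist)
        if best is None or d < best[1]:
            best = (set(removed), d)
    return best


# Toy tree: leaf D sits on a suspiciously long branch.
EDGES = [("A", "x", 1), ("B", "x", 1), ("x", "y", 1), ("C", "y", 1), ("D", "y", 10)]
print(k_shrink_bruteforce(EDGES, 1))  # removes D; diameter drops from 12.0 to 3.0
```

In this toy case a single leaf accounts for most of the diameter, which is the kind of disproportionate impact the abstract describes as a signal of error. How such impacts are turned into an outlier set is a separate statistical step that this sketch does not attempt.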
All the raw data are obtained from other publications as shown below. We further analyzed the data and provide the results of the analyses here. The methods used to analyze the data are described in the paper.

| Dataset | Species | Genes | Download |
| Plants | 104 | 852 | DOI 10.1186/2047-217X-3-17 |
| Mammals | 37 | 424 | DOI 10.13012/C5BG2KWG |
| Insects | 144 | 1478 | http://esayyari.github.io/InsectsData |
| Cannon | 78 | 213 | DOI 10.5061/dryad.493b7 |
| Rouse | 26 | 393 | DOI 10.5061/dryad.79dq1 |
| Frogs | 164 | 95 | DOI 10.5061/dryad.12546.2 |
FOS: Biological sciences, Phylogenomics