
pmid: 17988949
The combination of results from different large-scale datasets of multidimensional biological signals (such as gene expression profiling) presents a major challenge. Methodologies are needed that can efficiently combine diverse datasets, but can also test the extent of diversity (heterogeneity) across the combined studies. We developed METa-analysis of RAnked DISCovery datasets (METRADISC), a generalized meta-analysis method for combining information across discovery-oriented datasets and for testing between-study heterogeneity for each biological variable of interest. The method is based on non-parametric Monte Carlo permutation testing. The tested biological variables are ranked in each study according to the level of statistical significance. METRADISC tests for each biological variable of interest its average rank and the between-study heterogeneity of the study-specific ranks. After accounting for ties and differences in tested variables across studies, we randomly permute the ranks of each study and the simulated metrics of average rank and heterogeneity are calculated. The procedure is repeated to generate null distributions for the metrics. The use of METRADISC is demonstrated empirically using gene expression data from seven studies comparing prostate cancer cases and normal controls. We offer a new tool for combining complex datasets derived from massive testing, discovery-oriented research and for examining the diversity of results across the combined studies.
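The permutation procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that every study tested the same variables, that there are no ties, that rank 1 denotes the most significant variable, and that heterogeneity is measured as the sum of squared deviations of the study-specific ranks from the average rank. The function names (`rank_metrics`, `permutation_meta_analysis`) and the toy data are hypothetical.

```python
import random


def rank_metrics(ranks):
    """Return (average rank, heterogeneity) for one variable across studies.

    Heterogeneity here is the sum of squared deviations from the average
    rank; this particular metric is an assumption made for illustration.
    """
    mean = sum(ranks) / len(ranks)
    het = sum((r - mean) ** 2 for r in ranks)
    return mean, het


def permutation_meta_analysis(study_ranks, n_perm=1000, seed=0):
    """Monte Carlo permutation test on ranks.

    study_ranks[s][g] is the rank of variable g in study s.
    Simplifying assumptions: same variables in every study, no ties,
    rank 1 = most significant.
    """
    rng = random.Random(seed)
    n_vars = len(study_ranks[0])
    observed = [rank_metrics([s[g] for s in study_ranks]) for g in range(n_vars)]

    low_avg = [0] * n_vars    # permutations with average rank <= observed
    high_het = [0] * n_vars   # permutations with heterogeneity >= observed
    for _ in range(n_perm):
        # Randomly permute the ranks within each study independently.
        shuffled = [rng.sample(s, len(s)) for s in study_ranks]
        for g in range(n_vars):
            avg, het = rank_metrics([s[g] for s in shuffled])
            if avg <= observed[g][0]:
                low_avg[g] += 1
            if het >= observed[g][1]:
                high_het[g] += 1

    return [
        {
            "avg_rank": observed[g][0],
            "heterogeneity": observed[g][1],
            # Add-one correction keeps the estimated p-values above zero.
            "p_avg_rank": (low_avg[g] + 1) / (n_perm + 1),
            "p_heterogeneity": (high_het[g] + 1) / (n_perm + 1),
        }
        for g in range(n_vars)
    ]


# Toy data: 3 studies, 5 variables; variable 0 is ranked first everywhere.
toy_ranks = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 5, 4],
    [1, 5, 4, 2, 3],
]
results = permutation_meta_analysis(toy_ranks, n_perm=1000, seed=0)
```

In this toy example, variable 0 has an average rank of 1 and zero heterogeneity, so its average-rank p-value is small while its high-heterogeneity p-value equals 1. A fuller treatment would also need to handle ties and variables that are not tested in every study, as the abstract notes.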
Male, Gene Expression Profiling/*statistics & numerical data, Applications of statistics to biology and medical sciences; meta analysis, Meta-Analysis as Topic, Medical applications (general), Humans, *Models, Statistical, Databases, Protein, Nonparametric hypothesis testing, Oligonucleotide Array Sequence Analysis, Models, Statistical, Gene Expression Profiling, Prostatic Neoplasms, Monte Carlo methods, Prostatic Neoplasms/metabolism, *Algorithms, *Meta-Analysis as Topic, *Oligonucleotide Array Sequence Analysis, gene expression, Databases, Protein/*statistics & numerical data, heterogeneity, ranks, Monte Carlo Method, Algorithms
| selected citations | 40 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
