publication . Preprint . 2014

broom: An R Package for Converting Statistical Analysis Objects Into Tidy Data Frames

Robinson, David
Open Access English
  • Published: 11 Dec 2014
Abstract
The concept of "tidy data" offers a powerful framework for structuring data to ease manipulation, modeling and visualization. However, most R functions, both those built-in and those found in third-party packages, produce output that is not tidy, and that is therefore difficult to reshape, recombine, and otherwise manipulate. Here I introduce the broom package, which turns the output of model objects into tidy data frames that are suited to further analysis, manipulation, and visualization with input-tidy tools. Broom defines the "tidy", "augment" and "glance" generics, which arrange a model into three levels of tidy output respectively: the component level, the...
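The three generics named in the abstract can be illustrated with a minimal sketch. It assumes the broom package is installed, and it uses a linear model fitted to R's built-in mtcars data; the dataset and the model formula are chosen here purely for illustration and are not taken from the paper.

```r
# Minimal sketch: the three broom generics applied to one fitted model.
# Assumption: broom is installed; mtcars and the formula are illustrative only.
library(broom)

fit <- lm(mpg ~ wt, data = mtcars)

# tidy(): one row per model term (here the intercept and the wt slope),
# with columns such as estimate, std.error, statistic and p.value.
coefs <- tidy(fit)

# augment(): one row per observation in the data, with added columns
# such as .fitted and .resid.
obs <- augment(fit)

# glance(): a single-row summary of the whole model, with columns
# such as r.squared, AIC and BIC.
summ <- glance(fit)

print(coefs)
print(head(obs))
print(summ)
```

Because each result is an ordinary data frame, it can be filtered, joined, or plotted with the same tools used for any other tidy data.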
Subjects
ACM Computing Classification System: GeneralLiterature_MISCELLANEOUS
free text keywords: Statistics - Computation, Statistics - Methodology, 62-07, G.3, I.6
Funded by
NIH| Statistical Methods for High-Throughput Gene Expression Profiling
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 2R01HG002913-10A1
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE