Correlated data arise when multiple observations are collected from the same sampling unit (e.g., repeated measures of a bank over time, or hormone levels in a breast cancer patient over time), or when observations are clustered by a shared characteristic (e.g., banks grouped by zip code, or cancer patients treated at a specific clinic). The generalized linear model framework for independent data is extended to correlated data by introducing second-order variance components directly into the independent-data model's estimating equation. This generalization of the estimating equation from the independence model is referred to as a generalized estimating equation (GEE). This article discusses the foundation of GEEs and how user-specified correlation structures are accommodated in the model-building process. It also discusses the relationship to the underlying generalized linear model framework and points out alternative approaches to modeling correlated data, such as fixed-effects and random-effects models.

Keywords: working correlation matrix; sandwich estimate of variance; generalized linear models; subject-specific models; population-averaged models
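
The generalization described in the abstract can be sketched in standard textbook notation. The symbols below are assumptions introduced for illustration and do not appear in the abstract: panels i = 1..n with n_i observations each, mean vector \mu_i, variance function v(\cdot), dispersion \phi, working correlation matrix R(\alpha), and D_i = \partial\mu_i / \partial\beta. This is one common way of writing the equations, not necessarily the article's own presentation.

```latex
% Independence model: the GLM estimating equation, summed over panels i
% and over observations t within each panel.
\Psi(\beta) \;=\; \sum_{i=1}^{n} \sum_{t=1}^{n_i}
  \frac{y_{it} - \mu_{it}}{\phi\, v(\mu_{it})}
  \left( \frac{\partial \mu_{it}}{\partial \eta_{it}} \right) x_{it} \;=\; 0

% The same equation in matrix form, with A_i = diag( v(\mu_{i1}), ..., v(\mu_{i n_i}) ).
\Psi(\beta) \;=\; \sum_{i=1}^{n} D_i^{\top} \left( \phi A_i \right)^{-1} (y_i - \mu_i) \;=\; 0

% GEE: replace the diagonal variance \phi A_i with a working covariance
% built from a user-specified working correlation matrix R(\alpha).
\Psi(\beta) \;=\; \sum_{i=1}^{n} D_i^{\top} V_i^{-1} (y_i - \mu_i) \;=\; 0,
\qquad
V_i \;=\; \phi\, A_i^{1/2}\, R(\alpha)\, A_i^{1/2}
```

Setting R(\alpha) to the identity matrix gives V_i = \phi A_i, which recovers the independence-model equation. This is the sense in which the GEE generalizes the GLM estimating equation.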

Related Organizations

Arizona State University, United States
University of South Carolina System, United States
University of South Carolina, United States
