The Assumptions of the Linear Regression Model

Michael A. Poole; Patrick N. O'Farrell

Transactions of the Institute of British Geographers

Article . 1971 . Peer-reviewed

Data sources: Crossref

https://dx.doi.org/10.2307/621...

Data sources: Microsoft Academic Graph

Publication: Article, 01 Mar 1971. Publisher: JSTOR. Journal: Transactions of the Institute of British Geographers (ISSN: 0020-2754)

doi: 10.2307/621706


Abstract

The paper is prompted by certain apparent deficiencies both in the discussion of the regression model in instructional sources for geographers and in the actual empirical application of the model by geographical writers. In the first part of the paper the assumptions of the two regression models, the 'fixed X' and the 'random X', are outlined in detail, and the relative importance of each of the assumptions for the variety of purposes for which regression analysis may be employed is indicated. Where any of the critical assumptions of the model are seriously violated, variations on the basic model must be used and these are reviewed in the second half of the paper.

The rapid increase in the employment of mathematical models in planning has led R. J. Colenutt to discuss 'some of the problems and errors encountered in building linear models for prediction'.1 Colenutt rightly points out that the mathematical framework selected for such models 'places severe demands on the model builder because it is associated with a highly restrictive set of assumptions . . . and it is therefore imperative that, if simple linear models are to be used in planning, their limitations should be clearly understood'.2 These models have also been widely used in geography, for descriptive and inferential purposes as well as for prediction, and there is abundant evidence that, like their colleagues in planning, many geographers, when employing these models, have not ensured that their data satisfied the appropriate assumptions. Thus many researchers appear to have employed linear models either without verifying a sufficient number of assumptions or else after performing tests which are irrelevant because they relate to one or more assumptions not required by the model. Furthermore, many writers, reporting geographical research, have completely omitted to indicate whether any of the assumptions have been satisfied. This last group is ambiguous, and it is clearly not possible, unless the values of the variables are published, to judge whether the correct set of assumptions has been tested or, indeed, to ascertain whether any such testing has been performed at all.

This problem partially arises from certain shortcomings in material which has been published with the specific objective, at least inter alia, of instructing geographers on the use of quantitative techniques. All of these sources make either incomplete or inaccurate specifications of the assumptions underlying the application of linear models, although it is encouraging to note that there has been a considerable improvement in the quality of this literature in recent years. Thus, there were four books and two articles published in the early and mid-1960s which may be classified as belonging to this body of literature,3 yet, in five of these six sources, only one of the assumptions of the model is mentioned and, even

Impact by BIP!

- selected citations: 205. These citations are derived from selected sources.
- popularity: Top 1%. Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 1%. Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.