Electoral inquiry section
Loess: a nonparametric, graphical tool for depicting relationships between variables☆
Introduction
The purpose of this paper is to discuss the loess procedure for fitting smooth curves to scatterplots. Loess provides a graphical summary of the relationship between a dependent variable and one or more independent variables. The distinctive feature of this procedure is that it “allows the data to speak for themselves”. Loess is nonparametric, so the fitted curve is obtained empirically rather than through stringent prior specifications about the nature of any structure that may exist within the data. Therefore, loess-enhanced scatterplots often reveal relatively complex relationships that could easily be overlooked with traditional statistical modeling procedures.
Loess and other nonparametric estimation strategies are useful in social scientific research because current substantive theories usually provide little detail about the kinds of structural patterns that should exist within empirical data. In other words, hypotheses suggest which variables should be related to each other and, often, the direction of any such relationships; for example, “education levels should be positively related to voting turnout”. Beyond statements like this, however, there are generally no predictions about functional forms. Researchers therefore fall back on simple specifications for want of theory-based directions to the contrary, a situation that Beck and Jackman (1998) have recently called “linearity by default”. This creates a potentially serious problem because those detailed theories which do exist suggest that nonlinear relationships are pervasive throughout the field of elections, voting, and mass political behavior (e.g. Przeworski and Soares, 1971; Zaller, 1992; Brown, 1995). Thus, a nonparametric technique like loess should be very useful for discerning such nonlinearities and explicating their forms.
The rest of this paper provides a detailed presentation of the loess method, along with the major practical considerations involved in its use. Most of the discussion will focus on the simplest case: using loess as a descriptive, exploratory tool for fitting smooth curves to scatterplots. This is undoubtedly the kind of situation where loess is employed most frequently. However, the technique is much more general than this, so some attention will also be given to statistical inference and multivariate loess. Overall, loess is a very useful tool for discerning systematic structure within empirical data. As such, this technique should help researchers develop theories that provide accurate, powerful representations of real-world phenomena.
Section snippets
Scatterplot smoothing
The two-dimensional scatterplot is the basic graphical display method for bivariate data. At the same time, the scatterplot is the “building block” for more complex graphical depictions of multivariate data (Jacoby, 1998). One of the great strengths of the scatterplot is that it enables visual assessments of relationships or functional dependencies between the variables included in the display.
An example of loess smoothing
In order to demonstrate the utility of the loess procedure, we will examine a substantive example, using state-level data on education and voter turnout in the 1992 American presidential election. This is an ideal topic for our present purposes, because it epitomizes the ambiguities that often exist in our theoretical propositions. The relationship between education and mass political participation is widely acknowledged by social scientists. However, Nie et al. (1996) point out that even
Fitting a loess smooth curve
The loess procedure is computationally intensive; in other words, there is a large number of distinct steps involved in fitting even a simple loess curve to a small dataset. Nevertheless, the calculations themselves are straightforward. They should be readily understandable to anyone who is familiar with ordinary least squares regression analysis. The discussion in this section will provide a brief overview of the methodology underlying loess. Complete details and a simple, step-by-step example
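As a rough illustration of the kind of calculation involved, the following is a minimal sketch of local regression in Python, not the paper's own S-Plus routines. It assumes the common textbook formulation: at each target value, the nearest fraction of observations is weighted by a tricube function of distance, and a weighted least-squares line is fitted to them. The names `tricube`, `loess_point`, `loess_curve`, and `span` are hypothetical, and robustness iterations are omitted.

```python
import math


def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 when |u| < 1, otherwise 0."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(x0, xs, ys, span=0.5):
    """Fitted value at x0 from one locally weighted linear regression.

    `span` (a hypothetical name) is the fraction of observations in the window.
    """
    n = len(xs)
    q = max(2, min(n, math.ceil(span * n)))      # number of points in the window
    h = sorted(abs(x - x0) for x in xs)[q - 1]   # distance to the q-th nearest point
    w = [tricube((x - x0) / h) if h > 0.0 else 0.0 for x in xs]
    if sum(w) == 0.0:
        # Every window point lies exactly at distance h: weight them equally.
        w = [1.0 if abs(x - x0) <= h else 0.0 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw   # weighted mean of x
    my = sum(wi * y for wi, y in zip(w, ys)) / sw   # weighted mean of y
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0         # flat line if x does not vary
    return my + slope * (x0 - mx)


def loess_curve(xs, ys, span=0.5):
    """Evaluate the local fit at every observed x value."""
    return [loess_point(x0, xs, ys, span) for x0 in xs]
```

Each fitted value is an ordinary weighted regression prediction, which is why the individual steps are simple even though there are many of them: one regression per evaluation point.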
Fitting parameters for the loess smooth curve
The loess procedure is nonparametric in the sense that the analyst does not specify the functional form of the final smooth curve. However, there are some parameters that must be supplied prior to the fitting procedure in order to guarantee that the loess curve really does pass through the center of the empirical data points. Selecting the values for these parameters is a subjective process, but the considerations that are involved in the decisions are quite straightforward.
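The passage above does not name the parameters. In the usual formulation of loess, one of them is a smoothing parameter (often called the span) that sets the fraction of observations used in each local fit. The sketch below, with hypothetical names and made-up evenly spaced data, shows how that one choice changes the number of observations that receive nonzero tricube weight around a target value.

```python
import math


def window_weights(x0, xs, span):
    """Tricube weights around x0 when a fraction `span` of the data is used."""
    n = len(xs)
    q = max(2, min(n, math.ceil(span * n)))      # window size implied by the span
    h = sorted(abs(x - x0) for x in xs)[q - 1]   # distance to the q-th nearest point
    if h == 0.0:
        # Degenerate window: only points located exactly at x0 count.
        return [1.0 if x == x0 else 0.0 for x in xs]
    return [max(0.0, 1.0 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]


# Illustrative data only: 20 evenly spaced x values, target value 10.
xs = [float(i) for i in range(20)]
narrow = window_weights(10.0, xs, span=0.25)
wide = window_weights(10.0, xs, span=0.75)
n_narrow = sum(1 for w in narrow if w > 0.0)     # few points: a rougher, more local fit
n_wide = sum(1 for w in wide if w > 0.0)         # many points: a smoother fit
```

A small span lets the curve follow local wiggles in the data, while a large span averages over more observations and produces a smoother curve.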
Plotting loess residuals
The residuals from a loess fit can be employed as a useful diagnostic tool to determine whether the smooth curve adequately incorporates all of the interesting structure in the data. The strategy for doing so is identical to that used in traditional, linear regression analysis. The residuals are scrutinized for systematic patterns that may remain after a hypothesized structural representation has been fitted to the empirical data.
The loess residuals are defined as the difference
Goodness of fit for a loess smooth curve
When a loess smooth curve is fitted to data, attention is usually focused on the shape of the resultant curve because that feature is most revealing of the structure within the data. However, it is also useful to consider how well the smooth curve characterizes the empirical data values. This latter phenomenon is usually called ‘goodness of fit’, although that term is only partially appropriate in the case of nonparametric smoothers like loess.
A summary fit statistic similar to an R² value can
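One common R²-like summary, which may differ from the statistic the paper goes on to define, compares the residual sum of squares around the smooth curve with the total sum of squares around the mean. The sketch below assumes that definition; the function name `pseudo_r2` is hypothetical, and the fitted values can come from any smoother.

```python
def pseudo_r2(ys, fitted):
    """R-squared analogue: 1 - (residual sum of squares / total sum of squares)."""
    if len(ys) != len(fitted):
        raise ValueError("ys and fitted must have the same length")
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    if ss_total == 0.0:
        raise ValueError("ys has no variation, so the statistic is undefined")
    ss_resid = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1.0 - ss_resid / ss_total
```

Under this definition a curve that reproduces every observation gives 1, and a flat line at the mean gives 0. Because a smaller span always tracks the data more closely, a higher value is not by itself evidence of a better smooth.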
Loess and statistical inference
The discussion so far has assumed that loess is being used as a strictly descriptive tool. However, the statistical theory for local regression models has been worked out, so it is possible to incorporate an inferential component into a loess analysis. Doing so facilitates generalizations about the structure of the population from which the observed data were drawn. Inferential tools also enable the researcher to assess the degree of uncertainty about the precise form of the smooth curve fitted
Loess and multivariate data
Although the discussion so far has focused on bivariate scatterplot smoothing, loess can also be a useful tool for situations where a dependent variable is hypothesized to be a function of several independent variables. In fact, there are at least two different approaches that can be used: multivariate loess (or, more precisely, ‘local multiple regression’) and generalized additive models. Let us briefly consider each of these strategies.
Software for loess
Because of its computationally intensive nature, loess smoothing is effectively impossible to carry out by hand. Therefore, most potential users (at least non-programmers) are constrained by the options that are provided by the available software. Fortunately, loess fitting is now widely incorporated into statistical software packages. However, the exact nature, capabilities and flexibility of the routines vary markedly from one program to the next.
Some packages only provide basic scatterplot
Conclusions
Loess has recently received a great deal of attention in statistical circles, where it is recognized as one member of a broader family of procedures called nonparametric regression models (Green and Silverman, 1994; Fan and Gijbels, 1996; Fox, 1999). However, loess is far less well known among political scientists. This is unfortunate, because it provides a very flexible approach to the problem of representing structure within a dataset. Accordingly, loess fitting is a useful addition to the
Acknowledgements
I would like to thank Harold D. Clarke and Harvey Starr for their comments and suggestions on an earlier version of this paper. Special thanks go to Saundra K. Schneider; this project could not have been completed without her help.
References (44)
- et al. Regression by local fitting: methods, properties, and computational algorithms. J. Econometrics (1988)
- EViews 3.0 user's guide (1997)
- SAS/IML software: usage and reference, version 6 (1990)
- SAS/INSIGHT user's guide, version 6 (1995)
- STATA reference manual, release 5 (1997)
- S-PLUS guide to statistical and mathematical analysis (1995)
- Beck N, Jackman S, 1997. Getting the mean right is a good thing: generalized additive models. Working Paper at the...
- et al. Beyond linearity by default: generalized additive models. Am. J. Polit. Sci. (1998)
- et al. The visual design and control of trellis displays. J. Comput. Graph. Stat. (1996)
- et al. The use of brushing and rotation for data analysis
- Regression diagnostics
- Serpents in the sand: essays on the nonlinear nature of politics and human destiny
- Graphical methods for data analysis
- Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc.
- Visualizing data
- The elements of graphing data (revised edition)
- Locally weighted regression: an approach to regression analysis by local fitting. J. Am. Stat. Assoc.
- Graphical perception: theory, experimentation, and application to the development of graphical methods. J. Am. Stat. Assoc.
- Local regression models
- An introduction to regression graphics
- An introduction to the bootstrap
- Statehouse democracy: public opinion and public policy in the American States
☆ The data used in the examples presented in this paper, along with the S-Plus routines used to produce the graphs, can be found on the World Wide Web at http://www.cla.sc.edu/gint/faculty/jacoby