SCALE IDENTIFICATION IN SPATIALLY EXPLICIT POPULATION-ENVIRONMENT MODELING

Deirdre M. Mageean1, John G. Bartlett2 and Raymond J. O’Connor3

1 Margaret Chase Smith Center for Public Policy and Department of Resource Economics and Policy, University of Maine, Orono, ME 04473

2 Southern Global Change Program, USDA Forest Service, Raleigh, NC 27606

3 Department of Wildlife Ecology, University of Maine, Orono, ME 04469-5755

Address for Correspondence:

Dr. Deirdre Mageean, Margaret Chase Smith Center for Public Policy, Coburn Hall, University of Maine, Orono, ME 04473; (207) 581-1644; (207) 581-1266 (fax); [email protected]

ABSTRACT

Human interactions with the environment are often contingent on, and constrained by, regional conditions, and the development of operational methods of quantifying them has therefore proved elusive. We show from first principles that at least four structural classes of such interaction are possible. Regression tree models can identify such structuring in the influence of environmental variables on demographic responses.

We used regression trees, optimized by cross-validation, to model county-level population density in 1990 and to model relative population change over 1980-1990 in relation to climate and remotely sensed land cover variables over a 12,600 cell hexagon grid for the conterminous United States. The population density model yielded six, essentially spatially clustered, end nodes or zones. Within each zone population density was statistically associated with a unique set of environmental constraints. The model explained 59.8% of the variation in population density, mostly through elevation and a high population density flag, and remained unchanged following several post hoc model optimization tests. Constructing equivalent models for other variables relevant to population-environment relationships e.g. change in population density between 1980 and 1990, various indices of human settlement, etc., yielded models that were also functions of climate, seasonality, and certain types of land cover. Model structures for the different population-environment response variables generally approximated one particular theoretical structure, though the relative population change model conformed to a second type.

The residuals of empirical population density from the regression tree prediction within each hexagon revealed a systematic large-scale gradient across the eastern edge of the Great Plains. Within each of the six zones, on the other hand, the residuals in population densities were typically locally clustered, indicating a need for locally finer resolution models in these places. Many of these clusters of residuals coincided spatially with individual Omernik ecoregions and the amplitude of the residuals depended on which ecoregion was involved. In addition, residuals in hexagons that straddled ecoregion boundaries showed that populations there were locally higher than predicted, and conversely for interior hexagons, indicating that, even in a country as developed as the United States, population distribution was differentially centered on areas of local environmental diversity. Towns, highways, and ecoregional boundaries, typically in combination, were present where the continental scale model under-estimated population densities.

These results imply that no single scale and model structure is optimal for socio-demographic analysis over a global or continental extent. Instead, initial global models can fruitfully be regionally and locally refined in a recursive but geographically specific manner by use of hierarchical modeling techniques.

INTRODUCTION

Complex linked systems such as the interaction of human population distribution and the

natural environment require the synthesis and integration of data generated by processes that

operate on very different spatial scales. Moreover, these processes may be structured

hierarchically (Costanza et al. 1992, O'Neill et al. 1989). Spatially-extensive, aggregate factors

often emerge as important in analysis of such systems (Miller et al. 1996, Roth et al. 1996,

Wickham and Norton 1994, Hall et al. 1995).

In hierarchical systems lower-level "noise" may become locally or regionally coordinated

to the point where it constitutes a perturbation of a higher level (Norton and Ulanowicz 1992).

In such situations the global hierarchical model has to be locally modified to account for the

changed scale at which these localized processes are operating. This may be especially true of

human-environment interactions which, in general, may be concealed because population growth

interacts with other variables in location-specific ways. Deforestation, for example, has multiple

causes, with the particular mix of causes varying from place to place, and the pattern depends on

the intersection in time and space of the combined effects of various conditions (Rudel and

Roper 1996). Similarly, many population-environment analyses fail to find population correlates

of environmental degradation studied across nations but do find marked correlations between

land use change and population when analysis is restricted to regions of similar socioeconomic

characteristics. These types of regional distinctiveness argue strongly for analysis that

recognizes and allows for contingent effects between response and predictor variables and that

can determine where local development of higher resolution models is needed (Meyer and

Turner 1994). It remains, however, possible that in other situations population-environment

interactions are not hierarchically structured in this way but rather reflect the outcome of the

simultaneous action of multiple global (in the sense of long wavelength) drivers, e.g., Brown

(1995).

These diverse efforts to develop a spatially explicit understanding of population-

environment interactions (e.g., Cowen and Jensen 1998; Wood and Skole 1998; LUCC 1996)

have drawn attention to the need to conceptualize the links of population-environment research

to remotely sensed data and to develop tools to combine the disparate data so as to distinguish

among these possible structures (Geoghegan et al. 1998). Censuses, because they are

comprehensive, can be aggregated to various units, normally defined by political or

administrative boundaries. However, environmental influences typically work across such units,

and identification of their particular scales and spatial domains then requires comprehensive

environmental data and analysis across spatial, temporal, and hierarchical scales (Geoghegan et

al. 1998). Remotely observed land cover data can, for example, reflect both the influences of

urbanization and road development on tropical forests (Mertens and Lambin 1997) and the

converse effects of environmental determinants of population distribution and change (Rindfuss

and Stern 1998) but need to be analyzed at appropriate resolution and spatial extent. To this end

any advances in the conceptualization and quantification of the linkages between population

distribution and dynamics and environmental data are of considerable value. In the present paper

we describe how different patterns of environmental drivers on population attributes may

interact, we present an example of how such patterns can be determined statistically, and we

develop an approach to determining the scaling of the observed phenomena.

Spatial structuring in human-environment interactions

Decisions about the appropriate scale of analysis and aggregation of data are typically

driven by considerations both of theory and of data availability (Rindfuss and Stern 1998). To

understand the potential patterning of human impacts on the environment one must also

understand the patterning of environmental influences on humans. One can conceive of the

influence of environmental factors on humans being patterned within four distinct structures,

characterized by regional patterning of the effects of different variables. In the first of three

hierarchical patterns (Table 1a), each of a set of regions is dominated by the influence of a single

environmental factor considered favorable if above some threshold, and unfavorable otherwise.

Within each region all sites examined have common environmental conditions. Then the

hierarchy of regional environmental influences will take the form shown in Table 1a. First, one

region (for spatially auto-correlated data) is separated off on the basis of all of its sites satisfying

a threshold in variable A (which will be the strongest of the five variables). These sites have a

response level R1 (in population density, change in population density, wealth, housing

conditions, or whatever other response variable is being modeled in terms of the biophysical

drivers). Of the remaining sites, all of which are unfavorable in respect of the environmental

conditions characterized by variable A, a second block of sites is separated from the remaining

sites on the basis of variable B being favorable, and the response variable takes value R2 in this

region. Similarly, among the remaining sites (all of them now unfavorable in respect both of

variable A and of variable B), a third variable may be favorable, leading to response R3. This

process continues until eventually no further variable is available to discriminate among the

remaining locations. One might expect that the levels of response R1, R2, R3, etc., should form

a monotonically declining sequence but with empirical data it will be possible for reversals to

occur, reflecting particularly localized strength of response to a variable. This relatively simple

scenario can logically be described as a regional dominance model.
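
The sequential separation just described can be sketched in a few lines of Python. This is a minimal illustration only: the variable names, the thresholds of 0.5, and the response labels are hypothetical and are not taken from Table 1a or from any fitted model.

```python
# Hypothetical illustration of the regional dominance structure (Table 1a).
# Variable names, thresholds, and response labels are invented for the sketch.

def regional_dominance(site, rules, default):
    """Return the response level of the first variable that is favorable.

    rules is an ordered list of (variable, threshold, response) tuples,
    strongest variable first; a variable is favorable above its threshold.
    """
    for variable, threshold, response in rules:
        if site[variable] > threshold:
            return response
    return default  # no remaining variable discriminates these sites


RULES = [("A", 0.5, "R1"), ("B", 0.5, "R2"), ("C", 0.5, "R3")]

sites = [
    {"A": 0.9, "B": 0.1, "C": 0.1},  # favorable in A
    {"A": 0.2, "B": 0.8, "C": 0.9},  # unfavorable in A, favorable in B
    {"A": 0.1, "B": 0.3, "C": 0.7},  # favorable only in C
    {"A": 0.1, "B": 0.2, "C": 0.3},  # favorable in none
]
levels = [regional_dominance(s, RULES, "R4") for s in sites]
print(levels)
```

Note that the second site receives R2 even though it is also favorable in C: once a stronger variable has claimed a site, weaker variables are never consulted, which is what distinguishes this structure from an intersection of constraints.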

An alternative structure emerges where regions are influenced by the congruence of

multiple global factors (Table 1b). (“Global” is used here to refer to the entire spatial domain of

the sample points being considered which may, but need not, be world-wide). Suppose three

factors influence the distribution of humans and that each again acts globally in a simple binary

(favorable/unfavorable) manner. If these factors are A, B, and C, then there is a total of eight

possible combinations. If for specificity we assume that the absolute strengths of variables A, B,

and C are in that order, then the eight response levels R1 through R8 form a monotonically

declining sequence. Note that each of the eight regions is defined as the set intersection of the

values of the three variables. This scenario is logically termed the global constraint intersection

model.
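
As a rough sketch of this structure, the eight regions can be enumerated as the combinations of three binary factors. The ranking used below (A as the most significant factor, then B, then C) is an assumption made for illustration; it yields a monotonically declining sequence only if the effect of each factor outweighs the combined effect of the weaker ones.

```python
from itertools import product

# Assumed ordering of absolute strengths: A > B > C.
VARIABLES = ("A", "B", "C")


def response_table():
    """Map each favorable/unfavorable combination to a response label.

    Combinations are ranked with A as the most significant variable, so
    (True, True, True) maps to R1 and (False, False, False) maps to R8.
    """
    combos = product((True, False), repeat=len(VARIABLES))
    return {combo: f"R{rank}" for rank, combo in enumerate(combos, start=1)}


table = response_table()
for combo, label in table.items():
    print(dict(zip(VARIABLES, combo)), label)
```

Every factor appears in the definition of every region here, in contrast to the regional dominance sketch, where a site is assigned by the first favorable variable alone.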

The third situation is depicted by Table 1c. Here each region is controlled by virtue of

being in a set intersection of constraints but, in contrast to the situation in Table 1b where all

factors had global effects, here each factor along the hierarchy from left to right has only a

regional or local effect contingent on the value of stronger variables at higher levels (to its left)

in the hierarchy. Thus, if we view the regions of Table 1c in a human-environment context, each

region with a given response level, e.g., R1, results from the interaction between the

demographic variable and its chain of environmental constraints as specified in the columns of

the table. Note that these constraints 1) are only locally operative, and 2) may result in similar

values for the demographic variable across multiple regions. That is, the response variable

values R1 and R5 may be numerically very similar but the similar values occur by reason of

quite different combinations of environmental drivers. This scenario is logically a local

constraint intersection model.

The fourth type of model is not illustrated explicitly here since it reflects merely the

global influence of a predictor variable across the domain of interest and can be approximated by

piecewise regression. That is, if the response-predictor relation is perfectly linear, the

approximation is in effect the equivalent of Table 1b where variables B and C are now replaced

by quartile and octile splits on variable A; if the relationship is curvilinear, analogous

approximation between irregularly spaced percentiles results.

These four models have very different implications as to how population and

environment interact - the first as a series of regionally single factor influences, the second the

varied outcomes of a small set of environmental factors interacting globally, the third as a

hierarchical structure of increasingly local effects, and the fourth a globally dominant

environmental factor. In essence these theoretical constructs provide a basis for classification of

the structures actually present in the environmental relationships of any population variables to

be considered. This type of model development and assessment, coupled with the rich

demographic and environmental databases already available, has the potential to provide critical

information in developing a theoretical framework for human-environment interactions. For

example, Kates’ (1998) argument that the interaction of global environmental change with

economic restructuring and with population growth and migration is most readily examined at a

local level would be supported by a pattern of demographic response variables consistently

conforming to the local constraint intersection model but Brown’s (1995) macro-ecological

model of distribution would lead one to expect the global constraint intersection to be dominant.

Similarly, were the systematic nesting of ever smaller scales inherent in the regional dominance

model common-place, it might well reflect universal scaling due to data aggregation, as

described by Costanza and Maxwell (1994).

It is tempting to think of the processes just described as playing out on an array of spatial

scales that are universal across the domain of interest and that are immediately recognizable as to

relevant resolution. However, as the interaction of ecological and socio-economic processes

shapes landscape pattern (Lee et al. 1992), the connections of population to land cover change

and environmental drivers become weaker at smaller scales because other variables are

influential locally. That is, it is necessary to acknowledge that new, higher

resolution processes may appear only locally within the domain of discourse. For example, the

processes that shape human activity within the Willamette Valley of Oregon or the Central

Valley of California are unlikely to be the same as those shaping activity in the Cascades or in

the Sierra Nevada; and these processes are also unlikely to be simply a variant of those mountain

processes from which the sub-set linked to the high-frequency structure of the mountains is

simply absent. The nexus of processes that play out at any point on the landscape will generate a

sphere of spatial influence whose dimensions are currently unpredictable and which will vary

from place to place across the landscape. Improved ability to detect the presence of relevant

scales of human influence on landscapes will therefore be of considerable value in revealing

where such variables are to be searched for, it being impossible to gather high resolution data

with uniform success globally. Meyer and Turner (1994) use the same thinking in arguing that

better data as to the spatial congruence of population density with particular land covers will

clarify the role of population as a driving force of land cover change. Hence we develop in the present paper a

method of using the residuals about the class of models discussed above to identify where the

global models need to be modified in light of local and regional factors.

In the present paper we examine the distribution of human population density and of

attributes of the Bureau of the Census data for 1990 across the conterminous United States as a

model of how scale effects might be recognized and incorporated in population-environmental

analysis.

MATERIALS AND METHODS

Mageean and Bartlett (Mageean and Bartlett 1998a,b, Jones et al. 1999) used regression

tree analysis to model population density and socio-economic variables across the conterminous

United States against a series of environmental variables. The county-level population data used

were the 1990 population data from the U.S. Bureau of the Census, mapped to a regular grid

using a geographic information system (ARC/INFO - ESRI 1996). The grid used was the U.S.

Environmental Protection Agency’s Environmental Monitoring and Assessment Program

(EMAP), a hexagonal grid with approximately 12,600 spatial units within the 48 states (White et

al. 1992). Each grid cell was approximately 635 km², with a center-to-center spacing of

approximately 25 km. Each hexagon intersected the county level data in a series of polygons

and a density was assigned to each hexagon as an area-weighted sum of the densities in these

polygons.
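
The area weighting can be illustrated with a short sketch. It assumes that the weights are the polygon areas normalized by the total intersected area of the hexagon; the function name and the example figures are hypothetical and do not come from the census data.

```python
def hexagon_density(pieces):
    """Area-weighted density for one hexagon.

    pieces is a list of (area_km2, density_per_km2) tuples, one for each
    polygon formed where the hexagon intersects a county.
    Assumption: weights are polygon areas divided by their total.
    """
    total_area = sum(area for area, _ in pieces)
    if total_area == 0:
        raise ValueError("hexagon has no intersecting polygons")
    return sum(area * density for area, density in pieces) / total_area


# Hypothetical hexagon of 635 km2 split across three counties.
pieces = [(400.0, 10.0), (200.0, 40.0), (35.0, 120.0)]
density = hexagon_density(pieces)
print(round(density, 2))
```

A small but dense county fragment (the third piece) raises the hexagon value only in proportion to the area it occupies.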

The environmental data were those used by O’Connor et al. (1996). Briefly, this data set

comprised selected climate data, land cover class data from the Loveland et al. (1991) prototype

land cover classification of the United States, extended with the addition of an urban layer from

the Digital Chart of the World (Danko 1992), various landscape metrics derived therefrom

(Hunsaker et al. 1994) and various supporting data such as elevation, stream density, road

density, and gross patterns of federal versus non-federal ownership (later formalized as Wickham

et al. 1995). In the event, only rather few variables, notably climate variables, contributed to the

predictions.

Very high population densities were a complex function of urbanization rather than of

any of the environmental variables they considered and densities of more than 100 persons/km²

were high enough to make the likelihood of any significant ecological functioning negligible

(Terborgh 1989). We therefore flagged densities above this threshold as urban densities and

used this flag as a predictor variable to handle the excessively high densities involved.

Regression tree analysis, developed by Breiman et al. (1984) and used here in the form

of the S-Plus (Stat Sci, Inc., Seattle, Washington) implementation reviewed by Clark and

Pregibon (1992), is well adapted to handle the complex and unpredictable interactions of

ecological data. Regression tree analysis recursively partitions a data set through a series of

binary nodes, at each binary node evaluating all independent variables and all threshold values

thereof for ability to dichotomize the data into two subsets that are as different as possible as to

the values of the dependent variable. The independent variable and threshold that most

distinctively dichotomizes the sample is chosen as the appropriate splitting criterion for that

node. This process is repeated separately with each of the two descendent nodes, and the process

propagated until some stopping rule is satisfied. The process is prone to over-fitting in that, in

the extreme, partitioning could proceed until each terminal node contains just one member of the

sample group (or multiple members with zero variance) and therefore be fitted exactly. This

problem is addressed by over-fitting the tree initially, then pruning it back to a smaller tree on the

basis of cross-validation.
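
The search performed at a single binary node can be sketched as follows. This is a simplified pure-Python illustration, not the S-Plus implementation: it assumes a least-squares deviance criterion and midpoint thresholds, it omits the recursion and the cross-validated pruning, and the toy data and column names are invented.

```python
def sum_sq(values):
    """Sum of squared deviations from the mean (the node deviance)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, response, predictors):
    """Search all predictors and thresholds for the largest deviance drop.

    rows is a list of dicts. Returns (predictor, threshold, reduction),
    or None when no predictor has two distinct values.
    """
    parent = sum_sq([r[response] for r in rows])
    best = None
    for name in predictors:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [r[response] for r in rows if r[name] < cut]
            right = [r[response] for r in rows if r[name] >= cut]
            reduction = parent - sum_sq(left) - sum_sq(right)
            if best is None or reduction > best[2]:
                best = (name, cut, reduction)
    return best


# Invented toy data: six sites with two predictors and a log-density response.
rows = [
    {"elevation": 100, "jan_temp": -12, "log_density": 0.5},
    {"elevation": 150, "jan_temp": -2, "log_density": 1.6},
    {"elevation": 200, "jan_temp": 4, "log_density": 1.8},
    {"elevation": 1800, "jan_temp": -8, "log_density": 0.2},
    {"elevation": 2100, "jan_temp": -3, "log_density": 0.1},
    {"elevation": 2500, "jan_temp": 1, "log_density": 0.3},
]
split = best_split(rows, "log_density", ["elevation", "jan_temp"])
print(split)
```

A full tree would apply `best_split` again to each of the two subsets and continue until a stopping rule is met, after which pruning would select the tree size.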

For the present work the regression tree model thus created was used as the basis for

predicting population density in each of the 12,600 hexagons in the conterminous United States

and computing the residuals about those predictions for each hexagon. The resulting residuals

were exported to SYSTAT (SPSS Inc., Chicago, IL) for statistical analysis using analysis of

variance, t-test, non-parametric two-sample tests, and other conventional statistical tools. Maps

were prepared using ARCVIEW GIS (ESRI 1996). In addition to population density we

considered several other variables used by Bartlett et al. (2000), namely relative population

change over the decade, farmstead density, agricultural intensity, mean age of built structure, a

wealth index, and metropolitan status.

RESULTS

The Continental Model

Figure 1 shows the regression tree analysis for 1990 population density and for 1980-90

decadal relative change in population density. The regression tree for population density used a

logarithmic transformation of population density (persons/km²) as the dependent variable and

yielded a partitioning of the 12,600 hexagons into six end nodes (A-F in Figure 1a) by means of

a series of binary partitions of the data. The statistical fit of the model was moderately good,

accounting for 59.8 per cent of the deviance in density. The end nodes identify areas whose

environmental conditions are defined as the set intersections of the environmental constraints in

their chain of splitting conditions back to the root node, and these are mapped in Figure 2. Thus,

node A consists of those hexagons that had low population densities in cold, low elevation areas

(mean January temperature below -10.5°C). Similar interpretations can be applied to the other

end nodes. Four of the end nodes (A, B, D, and E) are groups of largely spatially contiguous

hexagons, with scattered outliers around their edges (Figure 2). Node C, on the other hand

(almost inevitably because of its definition in terms of concentrations of high population

density), was much more diffuse, capturing most of the large metropolitan centers in the country,

though Denver was absent because of its segregation by the split on elevation. Finally, node F

contained locations with both high elevation and high annual precipitation associated with the

western mountain ranges. The national pattern was thus a function of three major, large-scale,

natural factors (elevation, January temperatures, and annual precipitation). Among the many

small-scale patterning variables considered in the analysis, only the artificial urbanization index

appeared. The large wavelengths probably reflect spatial auto-correlation in the defining

variables of January climate, annual precipitation, and elevation.

The model for relative population change over the decade (Figure 1b) was rather

different, and notably more asymmetric than that for population density. Population change split

initially on the basis of seasonality, with a threshold of 25.2°C. In areas with high seasonality a

further split into end nodes occurred, again on the basis of seasonality. The effect was to divide

the hexagons across the conterminous United States into three groups, characterized by

seasonality above 28.3°C, seasonality levels between 25.2 and 28.3°C, and seasonality below

25.2°C. The least seasonal hexagons then, in turn, split off groups of various hexagons on the

basis of land cover (mean area of patches of barren land) (yielding node E), then on average

January temperature (yielding node D), then on deciduous forest patch size (yielding node C),

finally dividing into nodes A and B on the basis of relief.

The population density model of Figure 1a clearly conforms to the local constraint

intersection model rather than to the global constraint intersection model or to the regional

dominance model. Seven of the eight other variables subjected to regression tree analysis also

conformed to this model (though specific results are not presented here). Only the 1980-90

decadal population change shown in Figure 1b differed, conforming fairly closely to the regional

dominance model, the only deviation being an additional side-branch split off on seasonality.

Identifying Mesoscale Pattern

Each of the six terminal nodes for population density in Figure 1a predicted a population

density across all locations in its spatial domain (Figure 2) but for each hexagon the true

population density was in fact known. The deviations of these known values from their

respective node predictions could simply reflect statistical noise about the within-node mean, i.e.,

densities within each hexagon were randomly distributed about the mean. Alternatively, local

conditions within each domain may have caused systematic local deviation of the human

population from the value predicted across the node as a whole, yielding spatial concentrations

of deviations indicative of the need to introduce mesoscale or regional models to modify the

continental one. Figure 3 shows the distribution of hexagons whose residuals were at least one

standard deviation above or at least one standard deviation below the mean for their node. The error was

computed as “error = observed - predicted”, so that a positive value was obtained where

population density was under-estimated by the model. The deviations proved to be concentrated

in space. For example, in node A the concentration was primarily centered around Minneapolis-

St. Paul (Figure 3a), indicating that the influence of this metropolis extended well beyond the

area captured in the location of this city in node C. Smaller clusters of underestimates occurred

to the western edge of the node and also in Vermont. In contrast, the overestimates of population

in node A occurred in very small clusters and primarily on the western edge of the node,

excepting one cluster in Maine. Similar concentration of population around the suburbs of the

major urban areas was evident in node B for much of the eastern U.S. (Figure 3b) and for many

western cities, e.g., Phoenix, San Diego, and Los Angeles. Overestimates of population were

primarily along the western half of the node, with smaller clusters in eastern Maine, Michigan,

the Florida panhandle, New Mexico, and through the Pacific states. Most of the large

metropolitan areas in node C (Figure 3c) had centers underestimated in population e.g., Boston,

New York, Philadelphia, and Washington. Node D (Figure 3d) differed from node B in having

larger blocks of especially contiguous errors of both sorts, with positive and negative clusters

distributed fairly widely across the node. A concentration of overestimates along the eastern

boundary of node D alongside the corresponding underestimates along this boundary in node B

suggests a gradient in densities from east to west that was poorly modeled by the piecewise fits of

the node constants. Node E hexagons generally underestimated the population density along the

Mexican border with western Texas (Figure 3e), probably largely reflecting the growth of the

trans-border towns in recent years (Sable 1989). In this node population density was

systematically overestimated across the deserts of southwestern Arizona and of southern

California. The remaining node (Figure 3f) captured high elevation wet hexagons in both the

eastern and western part of the U.S., with distinct regional concentrations of overestimates in the

Appalachians and in the Cascades and underestimates in eastern Idaho and western Montana.
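
The residual calculation and the one-standard-deviation flagging can be sketched as below. The sketch assumes that the prediction for a node is the mean of the (log-transformed) values in that node and that the standard deviation is computed within each node; the hexagon identifiers and values are invented.

```python
from statistics import mean, pstdev


def flag_residuals(observed_by_node):
    """Flag hexagons whose residual exceeds one standard deviation.

    observed_by_node maps a node label to a list of (hexagon_id, value)
    pairs. The node prediction is taken to be the node mean, so
    residual = observed - predicted. Returns {hexagon_id: flag}, where
    "under" means the model under-estimated (positive residual) and
    "over" means it over-estimated (negative residual).
    """
    flags = {}
    for node, hexes in observed_by_node.items():
        predicted = mean([v for _, v in hexes])
        residuals = [(h, v - predicted) for h, v in hexes]
        sd = pstdev([r for _, r in residuals])
        for h, r in residuals:
            if r > sd:
                flags[h] = "under"
            elif r < -sd:
                flags[h] = "over"
    return flags


# Invented log-density values for two nodes.
data = {
    "A": [("h1", 1.0), ("h2", 1.2), ("h3", 0.8), ("h4", 3.0), ("h5", 1.0)],
    "B": [("h6", 2.0), ("h7", 2.1), ("h8", 0.2), ("h9", 2.1)],
}
flags = flag_residuals(data)
print(flags)
```

Mapping the flagged hexagons, rather than only tabulating them, is what reveals whether the deviations are spatially clustered.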

Ecoregional Patterning

Omernik (1987) classified land in the United States into ecoregions within which

geographical ecology was similar. The analysis of Figure 1a, based as it was on climate and land

cover attributes, potentially captured some or all of this regionalization. As there are 76

ecoregions, the best one could hope for by way of a fit was that each of the nodes of Figure 1a

would correspond to aggregations of ecoregions rather than have node boundaries overlapping

ecoregions. We therefore also calculated for each node an “ownership” value in respect of each

ecoregion, to measure the extent to which each ecoregion was contained within a node. If a node

consisted entirely and exclusively of, say, some six ecoregions, then its ownership value of each

of these six regions would be 100%. On the other hand, if one of the overlapped ecoregions had

25% of its hexagons in another node, then the first node would own only 75% of that ecoregion.

Table 2 presents the results. A minority of ecoregions had ownership values of 100%, indicating

that most ecoregions spanned more than one node.
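
The ownership calculation can be sketched as follows, using the 75% case described above as the worked example. The node labels and ecoregion names are hypothetical placeholders, not Omernik codes.

```python
from collections import Counter


def ownership(assignments):
    """Percentage of each ecoregion's hexagons that fall in each node.

    assignments is a list of (node, ecoregion) pairs, one per hexagon.
    Returns {(node, ecoregion): percent}.
    """
    per_pair = Counter(assignments)
    per_ecoregion = Counter(eco for _, eco in assignments)
    return {
        (node, eco): 100.0 * n / per_ecoregion[eco]
        for (node, eco), n in per_pair.items()
    }


# Hypothetical case: ecoregion "eco1" lies wholly in node A, while
# ecoregion "eco2" has three hexagons in node A and one in node B.
assignments = [("A", "eco1")] * 5 + [("A", "eco2")] * 3 + [("B", "eco2")]
shares = ownership(assignments)
print(shares)
```

Here node A owns 100% of the first ecoregion but only 75% of the second, with node B owning the remaining 25%.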

We considered further whether the residuals about the node means might vary from

ecoregion to ecoregion within a node and conducted an analysis of variance for each of the six

nodes (Table 3). Within all six nodes there was a strong effect of ecoregion on the residual,

though it was weakest in node C (the metropolitan areas node). In the other five nodes, however,

there was strong variation from ecoregion to ecoregion in the mean value of the residual from the

continental model. Recalling that the population densities were analyzed after log-

transformation, these differences indicate proportionately large variation in actual densities

between ecoregions within nodes. Examination of the mean residuals for individual ecoregions

showed no clear pattern, in some cases with neighboring ecoregions sharing rather similar values

but in other cases with spatially distant ecoregions having similar values and neighbors having

quite different values, all within individual nodes. In summary, although ecoregions were

not tightly coupled to the nodes of Figure 1a, the residuals of population density within any node

were not randomly spread across the individual ecoregions, and therefore seemed to deviate

individualistically from the global model.

Correlates of Residuals

A further ecoregional phenomenon was apparent. When all hexagons that straddled two

or more ecoregion boundaries were classed as “edge hexagons”, leaving as “interior hexagons”

those whose entire area was contained within a single ecoregion, population density residuals in

edge hexagons were significantly more positive than were those in interior hexagons (0.115 ±

0.995 versus -0.047 ± 0.979, t = 8.32, P < 0.001). We conclude, therefore, that edge hexagons

typically had higher population densities than expected for the node as a whole, while at the

same time the model tended to over-predict population density in the interior hexagons of each ecoregion.
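
A comparison of this kind can be sketched with a two-sample t statistic. The text does not say which variant of the t-test was used, so the Welch (unequal variance) form below is an assumption, and the residual values are invented rather than drawn from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic (unequal variances allowed)."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)


# Invented residuals for edge and interior hexagons.
edge = [0.9, 0.4, -0.2, 1.1, 0.3]
interior = [-0.5, 0.1, -0.8, 0.2, -0.4, -0.1]
t_stat = welch_t(edge, interior)
print(round(t_stat, 3))
```

A positive statistic indicates that residuals in edge hexagons are on average more positive than those in interior hexagons, i.e., that the model under-estimates density more often at ecoregion boundaries.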

Identifying local and regional drivers

As noted earlier, systematic clustering of residuals in population density implied that

some rather small-scale factors were working to bring about a local departure of population

density from the levels predicted by the continental model. The ecoregion boundaries

constituted one such factor, as just noted, but clearly accounted for only a small fraction of the

cases of under-estimation by the continental model. We therefore also considered as possible

correlates the extent to which towns and highways might lead to local enhancement of

population density. The presence of a town provides a social organization of humans into a

finite space which is large relative to our spatial units of analysis and a highway typically

promotes population growth along its length by virtue of the ease of travel it provides. Figure 4

shows the distribution of positive residuals with respect to the presence of ecoregions, towns, and

highways, either alone or in all possible combinations. Inspection of this map shows that the

majority of these residuals (63%; Table 4) coincided with the co-occurrence of highways and

towns; a third of these also coincided with ecoregion boundaries, notably along a chain of sites

down the eastern side of the Appalachians and in a few clusters in western states e.g., Utah,

Colorado, and Arizona. Clusters associated with highways alone were scarce (8.6%) and were

most noticeable in southern California, New Mexico, and Montana. Clusters centered only on a

town or only on an ecoregion boundary were present only as scattered instances. Finally, those

clusters that did not relate to one or more of these three variables (10.6% of the total) were

essentially confined to western and southwestern states. Thus most of the clusters of larger than

expected population densities were linked to highways and towns.
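As an illustrative check, and not part of the original analysis, the percentages quoted in this paragraph can be recomputed from the hexagon counts in Table 4. The Python sketch below assumes only those eight counts; the helper name share is hypothetical.

```python
# Counts of positive-residual hexagons from Table 4, keyed by
# (highway present, town present, ecoregion boundary present).
counts = {
    (True, True, True): 416,
    (True, True, False): 828,
    (True, False, True): 80,
    (True, False, False): 170,
    (False, True, True): 66,
    (False, True, False): 100,
    (False, False, True): 111,
    (False, False, False): 211,
}

total = sum(counts.values())


def share(predicate):
    """Percentage of hexagons whose factor combination satisfies predicate."""
    matched = sum(n for combo, n in counts.items() if predicate(*combo))
    return 100.0 * matched / total


# Hexagons where a highway and a town co-occur (boundary present or absent).
highway_and_town = share(lambda h, t, e: h and t)
# Hexagons associated with a highway and nothing else.
highway_only = share(lambda h, t, e: h and not t and not e)
# Hexagons associated with none of the three factors.
no_factor = share(lambda h, t, e: not (h or t or e))
# Fraction of the highway-and-town hexagons that also lie on a boundary.
boundary_given_ht = counts[(True, True, True)] / (
    counts[(True, True, True)] + counts[(True, True, False)]
)

print(f"total hexagons: {total}")
print(f"highway and town: {highway_and_town:.1f}%")
print(f"highway only: {highway_only:.1f}%")
print(f"no factor: {no_factor:.1f}%")
print(f"boundary share of highway-and-town hexagons: {boundary_given_ht:.2f}")
```

The same dictionary layout extends directly to any further binary factor that a later analysis might add.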

DISCUSSION

The methods described here constitute an extension of the approach developed by

Legendre and Fortin (1989), who used a regression surface to model long-wavelength, spatially

extensive patterns, followed by analysis of the distribution of residuals to identify smaller scale

patterns. The regression tree analysis used here has many similarities to other classes of

multivariate regression but offers the special advantages for population-environment analysis of

allowing the manifestation of contingent effects and interactions, even in circumstances where

the nature of these cannot be formulated a priori. The recursive partitioning involved allows

different rules to emerge in different places, possibly involving quite different variables in

different parts of the spatial domain. Our earlier work (Mageean and Bartlett 1998a,b, Bartlett et

al. 2000) exploited these properties to derive a continental model of the environmental

contingencies of human population density across the conterminous United States. Here, our

residuals analysis extended the approach to address the limitations of continental models and to

determine where smaller regional models needed to be introduced.
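The two-stage logic described in this paragraph, recursive partitioning followed by inspection of residuals within each end node, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the S-PLUS procedure used in the study: the hexagon records, the variable names (elevation, jan_temp, density), the minimum node size and the tree depth are all hypothetical, and no cross-validation or pruning is attempted.

```python
from statistics import mean, pstdev


def sse(values):
    """Sum of squared deviations of values about their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, predictors, response, min_size):
    """Find the (predictor, cut) pair minimizing the summed within-group SSE."""
    best = None
    for name in predictors:
        levels = sorted({r[name] for r in rows})
        for lo, hi in zip(levels, levels[1:]):
            cut = (lo + hi) / 2  # midpoint between consecutive distinct values
            left = [r[response] for r in rows if r[name] < cut]
            right = [r[response] for r in rows if r[name] >= cut]
            if len(left) < min_size or len(right) < min_size:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, name, cut)
    return best


def grow(rows, predictors, response, depth, min_size=4):
    """Recursively partition rows and return the end nodes as lists of rows."""
    split = best_split(rows, predictors, response, min_size) if depth > 0 else None
    if split is None:
        return [rows]
    _, name, cut = split
    left = [r for r in rows if r[name] < cut]
    right = [r for r in rows if r[name] >= cut]
    return (grow(left, predictors, response, depth - 1, min_size)
            + grow(right, predictors, response, depth - 1, min_size))


def node_z_scores(nodes, response):
    """Standardize each observation against the mean and SD of its own end node."""
    z = {}
    for node in nodes:
        values = [r[response] for r in node]
        m, sd = mean(values), pstdev(values)
        for r in node:
            z[r["id"]] = (r[response] - m) / sd if sd > 0 else 0.0
    return z


# Hypothetical hexagons: twelve lowland cells with high density and twelve
# upland cells with low density (all values invented for illustration).
hexagons = []
for i in range(24):
    lowland = i < 12
    hexagons.append({
        "id": i,
        "elevation": 100 + 20 * i if lowland else 800 + 50 * (i - 12),
        "jan_temp": -5 + (i % 12),
        "density": (30 if lowland else 8) + i % 3,
    })
hexagons[5]["density"] = 90  # a local departure, e.g. a town, unknown to the model

nodes = grow(hexagons, ["elevation", "jan_temp"], "density", depth=1)
z = node_z_scores(nodes, "density")
most_under_predicted = max(z, key=z.get)
print(len(nodes), "end nodes; largest positive z-score at hexagon", most_under_predicted)
```

In this toy case the single split falls on elevation, and the hexagon given an artificially high density stands out as the largest positive standardized residual within its node, which is the kind of signal the residuals analysis is designed to localize.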

We found that the local constraints model (Table 1c) was the pattern shown by the

regression tree for population density (Figure 1) and for all the other trees constructed bar that

for population change; this last proved to parallel the regional dominance model (Table 1a).

Thus the majority of the patterns of population-environment interactions, as in previous studies

(Mageean and Bartlett 1998b, Bartlett et al. 2000), provide evidence in favor of Kates’ (1998)

generalization that it is at local level that explanations of these interactions must be sought.

What our models provide is a quantified delineation of the spatial domains of the different

environmental drivers or combinations of drivers for which global data are available, together

with the ability to identify where specifically local data not considered in the global data set are

needed to account for a regionally cohesive departure from the global predictions. Thus our

prototype promises to allow systematic, quantified evaluation of the thinking underlying Kates’

(1998) argument. It is of particular interest in this respect that it is population change that

matches the regional dominance model. A reasonable interpretation would be that an individual

driver may be strong enough on its own to drive population decreases or to stimulate increase,

but that where it has no effect another driver then has its effects manifest. This simple notion is

one that cannot be captured by conventional regression models unless they are expressly

structured a priori to model such a phenomenon. Huston (2001) and O’Connor (2001) both

argue that the modeling of distributional data will typically be misleading if such multiple

alternative constraints are not accommodated in the analysis adopted.
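The contrast between the regional dominance structure (Table 1a) and the global constraint intersection structure (Table 1b) can be made concrete with a short Python sketch. The function names are hypothetical, and the encoding of favorable conditions as True is an assumption made purely for illustration.

```python
def regional_dominance(conditions):
    """Response level under Table 1a.

    conditions is ordered from the highest factor in the sequence downward,
    with True meaning favorable. The first favorable factor fixes the
    response; every factor below it is indifferent.
    """
    for rank, favorable in enumerate(conditions, start=1):
        if favorable:
            return f"R{rank}"
    return f"R{len(conditions) + 1}"  # every factor unfavorable


def global_intersection(conditions):
    """Response level under Table 1b.

    Every factor matters over the whole domain, so each combination of
    favorable and unfavorable states has its own response level.
    """
    index = 0
    for favorable in conditions:
        index = 2 * index + (0 if favorable else 1)
    return f"R{index + 1}"


# Five factors (A to E), as in Table 1a: only the first favorable one counts.
print(regional_dominance([False, True, True, False, True]))
# Three factors (A to C), as in Table 1b: all three states count.
print(global_intersection([True, False, True]))
```

With n factors the first structure yields only n + 1 distinct response levels, whereas the second yields 2^n, which is one reason an additive regression specified a priori struggles to represent the former.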

Regression trees assess correlation and not causation, limiting the interpretation possible

here. Nevertheless, such analyses delineate the patterns that causal explanations must account

for and provide pointers to the likely causal factors. In biogeography the Holdridge life zone

classification (Holdridge 1967) characterizes biomes on the basis of empirical relationships to

temperature and precipitation, an idea no different than our recognition of particular end nodes as

the set intersection of environmental constraints. Small and Cohen’s (2000) recent use of the

Gridded Population of the World database to determine the distribution of the world’s population

within the phase space of the environmental variables they considered is an explicitly human

distribution study in the same spirit. The fact of the end partitioning achieved in the present

study likewise implies that the regionalization deserves use in formulating hypotheses for further

study, particularly because of the extensive checking against collinearity within our suite of

potential predictor variables. It remains possible, as with all regression studies, that the variables

identified here may be yet confounded with other factors not considered as candidate predictors

but, if so, their action is necessarily limited to the spatial domain of that end node within our

regression tree. Thus our analysis limits the scope for potential misinterpretation and provides us

with “experimental” and “control” domains within which any potential confounder must display

appropriate effects.

That none of the many landscape metrics we considered appeared as predictors of

population density runs contrary to the widespread advocacy of their value in the recognition of

patterns within landscapes (Turner 1990, O'Neill et al. 1986), landscape pattern being a mixture

of natural and anthropogenic patches of varying size and configuration driven by the joint action

of physical, biological, and social forces (Burgess and Sharpe 1981, Forman and Godron 1986,

Krummel et al. 1987, Turner 1987, Turner 1990). Our residuals analysis, rather than landscape

metrics, was needed to detect regional patterns. One possibility is that high spatial variability in

population density may have prevented the regression tree analysis from detecting anything other

than long wavelength effects: the cross-validation process used is notoriously data hungry.

However, an alternative explanation may be that the linkage of landscape metric patterns to

ecological processes breaks down under intense human use of the land. Hulshoff (1995) found

that few landscape indices were useful in the intensively managed landscapes of the Netherlands:

the dominance index was insensitive to changes in landscape, shape indices of natural areas were

dominated by the presence of their human-modified surroundings, and no index tracked

locational changes in patches. These findings suggest that it might be unwise to rely on the

spatial patterning of landscape metrics as definitive in scale recognition with demographic and

socio-economic data, and that residuals analysis of the type demonstrated here may be a more

robust approach to detecting the need for regional scale changes. The promise of earlier

exploratory work towards the use of landscape metrics in detecting varied scale patterns, notably

Wickham and Norton’s (1994) system of mapping units called Landscape Pattern Types (LPT)

and Flamm and Turner’s (1994) landscape-condition labels, may play out over smaller extents

mapped with higher resolution spatial data. Flamm and Turner’s simulations showed that

the dynamics of models using pixel-based landscape metrics differ considerably from

the dynamics of models in which discrete landscape patches are treated as units. Such

results suggest that in the study of population-environment interactions it is essential to identify

where patterns in simple landscape metrics come together to form synthetic wholes before

conducting analyses of the influence of landscape composition and pattern on human

distribution. In effect, these studies and the present one support Norton and Ulanowicz’s (1992)

position as to local perturbation of spatially extensive models and meet Meyer and Turner’s

(1994) need for a middle scale between global (here continental) and local within which to

address population-environment relationships.

Overall we detected three patterns or levels of deviation superimposed on the regression

tree continental model. The first was the systematic change in sign of deviations across node

boundaries from east to west, indicating that the overall population density surface is merely

being approximated by the planes fitted within each node. The spatial contiguity within the

nodes reflects the parallel auto-correlation of the environmental variables involved (Figure 3) but

did not at the same time capture the gradients therein (cf. the steep descent in precipitation across

the eastern edge of the Great Plains). The second level at which deviations from the continental

model were detected in our analytical approach was the variation in residuals from Omernik

ecoregion to Omernik ecoregion within nodes (Table 3). These results linked the residuals

decisively to the Omernik ecoregions, indicating that the residuals about our models are in some

way associated with the characteristics and attributes that define each ecoregion as a synoptic

whole. The third level at which a pattern was detected here was in the relationship with the

boundaries of ecoregions, where we found a significant bias in most nodes toward

underestimation of the population density at the ecotone of two or more ecoregions. Since this

signal was manifest as a pattern in the residuals despite the ecoregion to ecoregion variation in

mean, it indicates that the ecotonal effect must be quite strong. Finally, we showed that the

occurrence of local clusters of under-estimates of population density was associated with the

presence of towns, highways, and ecoregion boundaries, the first two of which capture effects better

described as social than as biophysical. We are not suggesting that demonstrating that higher

densities of people are found along highways or in towns is a scientific break-through! Rather,

these factors are used here solely as a demonstration of how analysis of residuals in future work of this

type can be used to localize and investigate local or regional departures from models covering

large spatial extents. In that future work neither the localization of the domain of the regional or

local drivers nor the identity of those drivers will be known in advance. In the prototype set out

here a small number of attributes - towns, highways, ecoregion boundaries - proved correlated

with all but about 10% of the clusters across the conterminous U.S., i.e., the effects turned out to

be local effects that operated globally. However, it is in fact not necessary that this be the

outcome: it is entirely conceivable that local departures from global models be location-specific,

each requiring consideration of factors that are influential nowhere else in the domain of the

model.
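The second level of deviation discussed above, the ecoregion-to-ecoregion variation in residuals within a node (Table 3), rests on a one-way analysis of variance. A minimal sketch of the F-ratio calculation follows; the z-scores and ecoregion labels are invented for illustration and are not the study data.

```python
from statistics import mean


def one_way_f(groups):
    """F-ratio for a one-way analysis of variance over a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    # Between-group sum of squares: group means about the grand mean.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: observations about their own group mean.
    within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical standardized residuals for three ecoregions inside one end node.
residuals = {
    "ecoregion X": [0.9, 1.1, 1.0, 1.2],
    "ecoregion Y": [-0.2, 0.0, 0.1, 0.1],
    "ecoregion Z": [-1.0, -1.1, -0.9, -1.0],
}

f_ratio = one_way_f(list(residuals.values()))
print(f"F = {f_ratio:.1f} on {len(residuals) - 1} and "
      f"{sum(len(g) for g in residuals.values()) - len(residuals)} degrees of freedom")
```

A large F-ratio of this kind indicates that the residuals differ systematically among ecoregions within the node, although spatial auto-correlation among neighbouring hexagons means the nominal significance level should be read with caution.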

The most intriguing of these three findings was the tendency for high population densities

to exceed the predicted values along the edges of ecoregions: there are more people at the

transition areas between ecoregions than in the interiors thereof. It is well known in small scale

studies that ecotones are particularly valuable for wildlife (Leopold 1933) but, to our knowledge,

the pattern has not been reported for humans in relation to the much larger spatial extent of

ecoregions. It may be relevant that Pyšek (1992) analyzed the transition zones between human

settlements and adjacent rural areas in central Europe and discovered higher vegetation diversity

and plant species richness in settlement transition zones as compared to more urban or more rural

areas.

A key distinction between our work and previous investigations of how changes in scale

associated with change in the resolution (grain size) of a spatial analysis affect the conclusions

reached (Busing and White 1993, Benson and MacKenzie 1995, Moody and Woodcock 1995, Qi

and Wu 1996) is that they implicitly assume that a single scale (i.e., resolution) is optimal across

the domain of interest. A fundamental weakness in this approach is that it requires the

relationship between predictor and response variables to be identical over the whole spatial

domain of the data set at all scales. Our analyses provide an empirical example in which the

clusters at the regional and local scale do not show a regularity of pattern amenable to such

analysis, the need for finer resolution varying across space and being correlated with different

factors or combinations of factors (towns, highways, boundaries) at different locations. Global

analysis of our population density correlates at multiple levels of aggregation would have yielded

misleading results because the rules are local, not global.

The key to scale recognition in our analysis was the spatial contiguity of the hexagons

and of the residuals, which almost certainly originated in a spatial auto-correlation of the key

predictor variables, an auto-correlation which was not explicitly used in the regression tree

partitioning. Although it is well understood that statistical inference requires the members of a

sample to be mutually independent and that spatial auto-correlation lowers the effective degrees

of freedom, here our analysis can be seen as a first determination of the scale of any spatial

effects present. Borcard et al. (1992) recognize four distinct patterns of spatial variation: a

statistical, non-spatial dependence of the response variable on the environmental variables

considered; a purely spatial auto-correlation of the dependent variable over space; a dependence

of the dependent variable on a correlated response of the environmental variable over space (i.e.,

spatial auto-correlation within the predictor variable’s distribution); and a residual non-spatial

noise component. The value of a spatial regression tree analysis in any given case would then

depend on the relative magnitude of these four components.

With growing appreciation of the role of spatially explicit processes in shaping the

natural environment and the interaction of human populations with them, our work provides a

systematic approach to delineating the spatial domain within which the more local investigations

must take place. We have shown that ideas suggested within the natural sciences for the

application of the principles of landscape ecology to environmental issues (Legendre and Fortin

1989, Flamm and Turner 1994) can be developed to be viable identifiers of appropriate scales,

and in particular of regionally restricted scale changes, for the characterization of the spatial

scale of human activities and their environmental implications. Our combination of regression

tree analysis, to define the spatially extensive interaction of demography and environment,

with residuals analysis, to identify the location and extent of departures from such models, offers a

systematic approach to achieving such identification.

LITERATURE CITED

Bartlett, J.G., D.M. Mageean, and R.J. O'Connor. 2000. Residential expansion as a continental threat to U.S. coastal ecosystems. Population and Environment 21: 429-468.

Benson, B.J., and M.D. MacKenzie. 1995. Effects of sensor spatial resolution on landscape structure parameters. Landscape Ecology 10: 113-120.

Borcard, D., P. Legendre, and P. Drapeau. 1992. Partialling out the spatial component of ecological variation. Ecology 73: 1045-1055.

Breiman, L., J.H. Friedman, R.A. Olshen, and C.J. Stone. 1984. Classification and regression trees. Wadsworth, Belmont CA.

Brown, J.H. 1995. Macroecology. University of Chicago Press, Chicago, IL.

Burgess, R.L., and D.M. Sharpe, (Eds). 1981. Forest Island Dynamics in Man-Dominated Landscapes. Springer-Verlag, New York.

Busing, R.T., and P.S. White. 1993. Effects of area on old-growth forest attributes: implications for the equilibrium landscape concept. Landscape Ecology 8:119-126.

Clark, L.A., and D. Pregibon. 1992. Tree-based models. Pages 377-419 In: J.M. Chambers and T.J. Hastie, (Eds.) Statistical models in S. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, California.

Costanza, R., L. Wainger, C. Folke, and K.G. Mäler. 1992. Modelling complex ecological economic systems: Towards an evolutionary, dynamic understanding of humans and nature. Beijer International Institute of Ecological Economics Reprint Series Number 17.

Costanza, R., and T. Maxwell. 1994. Resolution and predictability: an approach to the scaling problem. Landscape Ecology 9:47-57.

Cowen, D. J., and J. R. Jensen. 1998. Extraction and modelling of urban attributes using remote sensing technology, Pages 164-188 In: D.L. Liverman, E.F. Moran, R.R. Rindfuss, and P.C. Stern (Eds.) People and pixels: linking remote sensing and social science. Washington D.C.: National Academy Press.

Danko, D.M. 1992. The digital chart of the world. GeoInfo Systems 2:29-36.

ESRI - ARC/INFO Version 7.0.4. 1996. Environmental Systems Research Institute, Inc. Redlands, CA.

Flamm, R.G., and M.G. Turner. 1994. Alternative model formulations for a stochastic simulation of landscape change. Landscape Ecology 9:37-46.

Forman, R.T.T., and M. Godron. 1986. Landscape Ecology. New York: John Wiley and Sons.

Geoghegan, J., L. Pritchard, Y. Ogneva-Himmelberger, R.R. Chowdhury, S. Sanderson, and B.L. Turner II. 1998. "Socializing the pixel" and "pixelizing the social" in land-use and land-cover change, Pages 51-69, In: D.L. Liverman, E.F. Moran, R.R. Rindfuss, and P.C. Stern (Eds.), People and pixels: linking remote sensing and social science. Washington D.C.: National Academy Press.

Hall, C.A.S., Hanqin Tian, Ye Qi, G. Pontius, J. Cornell, and J. Uhlig. 1995. Spatially explicit models of land-use change and their application to the tropics. DOE Research Summary No. 31, February 1995.

Holdridge, L.R. 1967. Life Zone Ecology. Tropical Science Center, San Jose, Costa Rica.

Hulshoff, R.M. 1995. Landscape indices describing a Dutch landscape. Landscape Ecology 10:101-111.

Hunsaker, C.T., R.V. O’Neill, S.P. Timmins, B.L. Jackson, D.A. Levine, and D.J. Norton. 1994. Sampling to characterize landscape pattern. Landscape Ecology 9:207-226.

Huston, M.A. 2001. Ecological context for predicting occurrences. In: J.M. Scott, P.J. Heglund, J.B. Haufler, M.L. Morrison, M.G. Raphael, W.B. Wall, and F. Samson, (Eds.) Predicting Species Occurrences: Issues of Accuracy and Scale. Island Press.

Jones, M.T., J.G. Bartlett, and D.M. Mageean. 1998. Visualising the hierarchical organization of a spatially explicit socioeconomic system. Systems Research and Information Systems 8:137-149.

Kates, R.W. 1998. Expanding our directions. Land Use and Land Cover Change Newsletter (Special Issue: The Earth’s Changing Land Conference), Number 3, March 1998. Barcelona, Spain: Institut Cartogràfic de Catalunya.

Lee, R.G., R.O. Flamm, M.G. Turner, C. Bledsoe, P. Chandler, C. DeFarrare, R. Gottfried, R.J. Naiman, N. Schumaker, and D. Wear. 1992. Integrating sustainable development and environmental quality: a landscape ecology approach. Pages 499-521 In: R.J. Naiman (Ed.) New Perspectives in Watershed Management. Springer Verlag: New York.

Legendre, P., and M.-J. Fortin. 1989. Spatial pattern and ecological analysis. Vegetatio 80:107-138.

Leopold, A. 1933. Game Management. Charles Scribner’s Sons: New York.

Loveland, T.R., J.W. Merchant, D.J. Ohlen, and J.F. Brown. 1991. Development of a land-cover characteristics database for the conterminous U.S. Photogr. Engin. Rem. Sens. 57:1453-1463.

LUCC. 1996. Land use and cover change: Open Science Meeting Proceedings. L. Fresco, R. Leemans, B.L. Turner II, D. Skole, A.G. van Zeijl-Rozema, and V. Haarmann (Eds.). Barcelona, Spain: Institut Cartogràfic de Catalunya, 1997.

Mageean, D.M. and J.G. Bartlett. 1998a. Putting people on the map: integrating social science data with environmental data. In: Pecora 13 Proceedings: Human interactions with the environment - perspectives from space. Bethesda, MD: American Society of Photogrammetry and Remote Sensing. CD-ROM, 1 disk.

Mageean, D.M., and J.G. Bartlett. 1998b. Using population data to address the problems of human dimensions of environmental change. Pages 193-205 In: S. Morain (Ed.), GIS in natural resource management: balancing the technical-political equation. Santa Fe, NM: High Mountain Press.

Mertens, B., and E.F. Lambin. 1997. Spatial modelling of deforestation in southern Cameroon. Applied Geography 17:143-162.

Meyer, W.B., and B.L. Turner II. 1994. Global land-use and land-cover change: an overview. Pages 3-10 In: W.B. Meyer and B.L. Turner II, Eds. Changes in land use and land cover: a global perspective. Cambridge University Press: Cambridge.

Miller, J.R., L.A. Joyce, R.L. Knight, and R.M. King. 1996. Landscape patterns and road density in the Southern Rocky Mountains. Landscape Ecology 11:115-127.

Moody, A., and C.E. Woodcock. 1995. The influence of scale and the spatial characteristics of landscapes on land-cover mapping using remote sensing. Landscape Ecology 10:363-379.

Norton, B.G., and R.E. Ulanowicz. 1992. Scale and biodiversity policy: a hierarchical approach. Ambio 21:244-249.

O’Connor, R.J. 2001. The conceptual basis of species distribution modeling: time for a paradigm shift. Pages 25-33 In: Scott, J.M., P.J. Heglund, F. Samson, J. Haufler, M. Morrison, M. Raphael, and B. Wall (Eds.), Predicting Species Occurrences: issues of scale and accuracy. Island Press.

O’Connor, R.J., M.T. Jones, D. White, C. Hunsaker, T. Loveland, B. Jones, and E. Preston. 1996. Spatial partitioning of environmental correlates of avian biodiversity in the conterminous United States. Biodiversity Letters 3:97-110.

Omernik, J.M. 1987. Ecoregions of the conterminous United States. Annals of the Association of American Geographers 77:118-125.

O’Neill, R.V., C.T. Hunsaker, S.P. Timmins, B.L. Jackson, K.B. Jones, K.H. Riitters, and J.D. Wickham. 1996. Scale problems in reporting landscape pattern at the regional scale. Landscape Ecology 11:169-180.

O'Neill, R.V., A.R. Johnson, and A.W. King. 1989. A hierarchical framework for the analysis of scale. Landscape Ecology 3:193-205.

Pyšek, P. 1992. Settlement outskirts - may they be considered as ecotones? Ekológia (ČSFR) 11:273-286.

Qi, Y., and J. Wu. 1996. Effects of changing spatial resolution on the results of landscape pattern analysis using spatial autocorrelation indices. Landscape Ecology 11:39-49.

Rindfuss, R.R., and P.C. Stern. 1998. Linking remote sensing and social science: the need and the challenges, Pages 1-27, In: D.L. Liverman, E.F. Moran, R.R. Rindfuss, and P.C. Stern (eds.), People and pixels: linking remote sensing and social science. Washington D.C.: National Academy Press.

Roth, N.E., J.D. Allan, and D.L. Erikson. 1996. Landscape influences on stream biotic integrity assessed at multiple spatial scales. Landscape Ecology 11:141-156.

Rudel, T.K., and T. Roper. 1996. Regional patterns and historical trends in tropical deforestation, 1976-1990: a qualitative comparative analysis. Ambio 25:160-166.

Sable, M.H., (Ed.) 1989. Las Maquiladoras : Assembly and manufacturing plants on the United States-Mexico border : An International Guide. Haworth Press: Binghamton, New York.

Small, C., and J.E. Cohen. 2000. Physiography, climate and the global distribution of human population. Available at: http://sedac.ciesin.org/plue/gpw/index.html?workshop.html&2

S-PLUS Version 3.3. 1995. StatSci, a division of MathSoft, Inc., Seattle, WA.

Terborgh, J. 1989. Where have all the birds gone? : Essays on the biology and conservation of birds that migrate to the American tropics. Princeton University Press, Princeton, New Jersey.

Turner, M.G. 1987. Spatial simulation of landscape changes in Georgia: a comparison of three transition models. Landscape Ecology 1:29-36.

Turner, M.G. 1990. Spatial and temporal analysis of landscape patterns. Landscape Ecology 4:21-30.

White, D., J. Kimmerling, and W.S. Overton. 1992. Cartographic and geometric components of a global design for environmental monitoring. Cartographic and Geographic Information Systems 19:5-22.

Wickham, J.D., and D.J. Norton. 1994. Mapping and analyzing landscape patterns. Landscape Ecology 9:7-23.

Wickham, J.D., J. Wu, and D.F. Bradford. 1995. Stressor data sets for studying species diversity at large spatial scales. US EPA 600/R-95/018. Office of Research and Development. U.S. Environmental Protection Agency, Washington, DC.

Wood, C.H. and D. Skole. 1998. Linking satellite, census, and survey data to study deforestation in the Brazilian Amazon, Pages 70-93, In: D.L. Liverman, E.F. Moran, R.R. Rindfuss, and P.C. Stern (Eds.), People and pixels: linking remote sensing and social science. Washington D.C.: National Academy Press.

Table 1a. Sequentially nested population-environment interaction (the regional dominance model) in which favorable conditions for each successive factor are influential only if all factors higher in the sequence are unfavorable.

Variable A Variable B Variable C Variable D Variable E Response level

Favorable Indifferent R1

Unfavorable Favorable Indifferent R2



Unfavorable Favorable R5

Unfavorable R6

Table 1b. Hierarchically structured population-environment interaction (the global constraint intersection model) where each environmental factor has influence over the whole spatial domain, and where R1>R2>R3>... >R8 if the influence of factor A is greater than that of factor B which in turn is greater than that of factor C.

Variable A     Variable B     Variable C     Response level
Favorable      Favorable      Favorable      R1
Favorable      Favorable      Unfavorable    R2
Favorable      Unfavorable    Favorable      R3
Favorable      Unfavorable    Unfavorable    R4
Unfavorable    Favorable      Favorable      R5
Unfavorable    Favorable      Unfavorable    R6
Unfavorable    Unfavorable    Favorable      R7
Unfavorable    Unfavorable    Unfavorable    R8

Table 1c. Hierarchically structured population-environment interaction (the local constraint intersection model) where the influence of local factors is contingent on conditions higher in the hierarchy.

Variable A Variable B Variable C Variable D Variable E Variable F Variable G Response level

Favorable Favorable Favorable R1

Favorable Favorable Unfavorable R2

Favorable Unfavorable Favorable R3

Favorable Unfavorable Unfavorable R4

Unfavorable Favorable Favorable R5

Unfavorable Favorable Unfavorable R6

Unfavorable Unfavorable Favorable R7

Unfavorable Unfavorable Unfavorable R8

Table 2. Percent ownership of Omernik ecoregions by six regression tree end nodes for the response variable: population density in 1990 and its environmental predictors (Figure 1a). An ownership value of 100 indicates that the entire ecoregion was contained within a single end node. Only ecoregions with >50% ownership by a given end node are listed.

Node A          Node B          Node C          Node D          Node E          Node F
Ecoregion    %  Ecoregion    %  Ecoregion    %  Ecoregion    %  Ecoregion    %  Ecoregion    %
       48  100         28  100         64   79         12  100         24   95          4   87
       49  100         36  100         59   66         41  100         14   61          5   76
       51   74         37  100         76   50         42  100         30   56         62   65
       50   55         39   99                         43  100                         66   65
                       35   98                         44  100                         69   64
                       33   98                         45  100
                       31   97                         18  100
                       40   96                         16   99
                       73   95                         20   99
                       74   94                         13   99
                       71   92                         21   93
                       72   92                         19   92
                       65   91                          9   90
                       70   90                         22   90
                       54   89                         11   90
                       29   86                         17   80
                       63   85                         23   74
                       34   84                         26   68
                        3   83                         25   67
                       55   82                         10   67
                       75   81                         15   60
                       57   81                         46   55
                       32   81                         27   54
                        7   80
                       68   80
                       53   74
                       47   71
                       56   70
                       61   65
                       38   60
                        2   56
                       67   56
                        1   55

Table 3. Analysis of variance results for standardized population density residuals (z-scores). Residuals about the predicted mean for each regression tree node (see Figure 1a) were transformed to z-scores and variation assessed across Omernik (1987) ecoregions within that node.

Node    # ecoregions wholly or partly in the end node    F-ratio    Significance Level
A        8                                               77.47      0.001
B       32                                               48.91      0.001
C       29                                                3.74      0.001
D       25                                               39.31      0.001
E       11                                               78.73      0.001
F       15                                               70.51      0.001

Table 4. Frequency of occurrence of highways, towns, and ecoregion boundaries in hexagons with higher population density than predicted by the regression tree model of Figure 1a.

Highways      Towns         Ecoregion Boundaries    Number of hexagons       Per cent of Total
(n = 1494)    (n = 1410)    (n = 673)               with this combination    (total n = 1982)
Present       Present       Present                 416                      21.0
Present       Present       Absent                  828                      41.8
Present       Absent        Present                  80                       4.0
Present       Absent        Absent                  170                       8.6
Absent        Present       Present                  66                       3.3
Absent        Present       Absent                  100                       5.0
Absent        Absent        Present                 111                       5.6
Absent        Absent        Absent                  211                      10.6

Figure 1 a. Regression tree for population density in 1990 and its environmental correlates across the conterminous United States. b. Regression tree for relative change in population density 1980-1990 and its environmental correlates. Numbers inside the oval or rectangle are mean values for the response variable (pop. density/km2 or change in pop. density) across all hexagons associated with that branch (oval) or end node (rectangle). The environmental variable that best explains the variation in response variable for each recursive hexagon subgroup is shown at each branch and its splitting value is given.

[Figure 1a: regression tree diagram, Population Density 1990. End nodes A-F. Splits shown: Avg. January Temp. (< or > -10.5 °C); Avg. January Temp. (< or > 1.8 °C); Avg. Precipitation (< or > 922.8 mm); High Density? (yes or no); Mean Elevation (< or > 454 m).]

[Figure 1b: regression tree diagram, Change in Population Density 1980-1990. End nodes A-G. Splits shown: Avg. Seasonality (< or > 25.2 °C); Avg. Patch Size, Barren Land (> or < 1.7 km2); Avg. Seasonality (< or > 28.3 °C); Avg. January Temp. (< or > 12.3 °C); Avg. Patch Size, Deciduous Forest (< or > 0.5 km2); Elevation Range (< or > 484 m).]

Figure 2. The regionalization of environmental correlates for population density in 1990 modeled by the regression tree analysis. All hexagons of one color form a common end node (labeled on Figure 1a) and thus share the same combination of predictor variables.

[Figure 2: map of the conterminous United States. Legend: End Node A, B, C, D, E, F; Population Density / km2 (1990).]

Figure 3. Standard error scores (z-scores) [(observed-mean)/standard deviation] associated with regression tree end nodes for population density in 1990 and its environmental correlates across the conterminous United States. Light gray hexagons are within 1 standard deviation of the mean for that node while medium gray hexagons are greater than 1 standard deviation below the mean (over-fit areas) and dark gray hexagons are greater than 1 standard deviation above the mean (under-fit areas) for that end node.

Figure 4. The distribution of positive residuals with respect to the presence of highways, towns, and ecoregions, either alone or in all possible combinations.

[Figure 4: map titled "All Factors Combined (highways, towns, ecoregions considered simultaneously)". Key: each three-digit code gives presence (1) or absence (0) of highways, towns, and ecoregions, in that order. Examples: 011 = towns and ecoregions (but not highways) were important; 111 = all three were important; 000 = none were important.]
