14
Dasymetric Mapping and Areal Interpolation: Implementation and Evaluation Cory L. Eicher and Cynthia A. Brewer ABSTRACT: Dasymetric maps display statistical data in meaningful spatial zones. Such maps can be preferable to choropleth maps that show data by enumeration zones, because dasymetric zones more accurately represent underlying data distributions. Though dasymetric mapping has existed for well over a century, the methods for producing these maps have not been thoroughly examined. In contrast, research on areal interpolation has been more thorough and has examined methods of transferring data from one set of map zones to another, an issue that is applicable to dasymetric mapping. Inspired by this work, we tested five dasymetric mapping methods, including methods derived from work on areal interpolation. Dasymetric maps of six socio-economic variables were produced fm a study area of 159 counties in the eastern U.S. using county choropleth data and ancillary land-use data. Both polygonal (vector) and grid (raster) dasymetric methods were tested. We evaluated map accuracy using both statistical analyses and visual presentations of error. A repeated-measures analysis of variance showed that the traditional limiting variable method had significantly lower error than the other four methods. In addition, polygon methods had lower error than their grid-based counterparts, though the difference was not statistically significant. Error maps largely supported the conclusions from the statistical analysis, while also presenting patterns of error that were not obvious from the statistics. KEYWORDS: Dasymetric mapping, areal interpolation, mapping census data, map error Introduction N asymetric map depicts quantitative areal data using boundaries that divide the apped area into zones of relative homo- geneity with the purpose of best portraying the underlying statistical surface. The dasymetric map was conceived as a type of thematic map during the early to mid nineteenth century, the forma- tive years of modern thematic cartography. During their early development, the demand for both dasymetric and choropleth maps was driven by interest in population mapping (McCleary 1969; 1984). By 1900, dasymetric and choropleth map- ping methods became more clearly differentiated, with the latter becoming overwhelmingly popular in modern cartography and for general use out- side the discipline. In contrast, dasymetric map- ping has remained relatively unknown even to most geographers. Consistent with their original purpose, dasymetric maps of population are still the most common type found today. Although dasymetric maps are closelyrelated to choropleth maps, they differ in several ways. First, Cory Eicher is a Sotware Development Cartography Specialist at ESRllnc., 380 New York Street, Redlands, CA 92373, USA. Cindy Brewer is an Associate Professor of Geography at The Pennsyl- vania State University, 302 Walker Building, University Park, PA 16802, USA. E-mail: [email protected]; [email protected]. zonal boundaries on dasymetric maps are based on sharp changes in the statistical surface being mapped, while zonal boundaries on choropleth maps demarcate enumeration units established for more general purposes (e.g., states within the U.S.). The cartographer generates dasymetric zones by using ancillary information. This information can be both objective and subjective, depending on other available data and the cartographer's knowl- edge of the area. Second, individual dasymetric zones are developed to be internally homogeneous. In contrast, choropleth zones are not defined based on the data and, thus, have varying levels of internal homogeneity. Third, choropleth mapping methods have become standardized (including the development of common classification schemes; Slocum 1999), but the wide range of dasymetric procedures have been under researched. Surprisingly little literature exists on dasymet- ric mapping. At a theoretical level, MacEachren (1994) placed dasymetric maps in the continuum between isopleth and choropleth maps, suggest- ing that dasymetric maps represent data half way between smooth and stepped statistical surfaces. In preparation for this research, we studied the two most frequently cited early pieces by Wright (1936) and McCleary (1969). More recent implementa- tions of dasymetric mapping methods within GIS include Gerth (1993), Holloway et al. (1996), and Charpentier (1997). Their work relies heavily on Cartography and GeograPhic Information Science, l1Jl. 28, No.2, 2001, pp.125-138

50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

  • Upload
    ledung

  • View
    241

  • Download
    1

Embed Size (px)

Citation preview

Page 1: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

Dasymetric Mapping and Areal InterpolationImplementation and Evaluation

Cory L Eicher and Cynthia A Brewer

ABSTRACT Dasymetric maps display statistical data in meaningful spatial zones Such maps can bepreferable to choropleth maps that show data by enumeration zones because dasymetric zones moreaccurately represent underlying data distributions Though dasymetric mapping has existed for wellover a century the methods for producing these maps have not been thoroughly examined In contrastresearch on areal interpolation has been more thorough and has examined methods of transferringdata from one set of map zones to another an issue that is applicable to dasymetric mapping Inspiredby this work we tested five dasymetric mapping methods including methods derived from work onareal interpolation Dasymetric maps of six socio-economic variables were produced fm a study areaof 159 counties in the eastern US using county choropleth data and ancillary land-use data Bothpolygonal (vector) and grid (raster) dasymetric methods were tested We evaluated map accuracy usingboth statistical analyses and visual presentations of error A repeated-measures analysis of variance showedthat the traditional limiting variable method had significantly lower error than the other four methodsIn addition polygon methods had lower error than their grid-based counterparts though the differencewas not statistically significant Error maps largely supported the conclusions from the statistical analysiswhile also presenting patterns of error that were not obvious from the statistics

KEYWORDS Dasymetric mapping areal interpolation mapping census data map error

Introduction

Nasymetric map depicts quantitative arealdata using boundaries that divide the

apped area into zones of relative homo-geneity with the purpose of best portraying theunderlying statistical surface The dasymetric mapwas conceived as a type of thematic map duringthe early to mid nineteenth century the forma-tive years of modern thematic cartography Duringtheir early development the demand for bothdasymetric and choropleth maps was driven byinterest in population mapping (McCleary 19691984) By 1900 dasymetric and choropleth map-ping methods became more clearly differentiatedwith the latter becoming overwhelmingly popularin modern cartography and for general use out-side the discipline In contrast dasymetric map-ping has remained relatively unknown even tomost geographers Consistent with their originalpurpose dasymetric maps of population are stillthe most common type found today

Although dasymetric maps are closely related tochoropleth maps they differ in several ways First

Cory Eicher is a Sotware Development Cartography Specialist atESRllnc 380 New York Street Redlands CA 92373 USA CindyBrewer is an Associate Professor of Geography at The Pennsyl-vania State University 302 Walker Building University Park PA16802 USA E-mail ceicheresricom cab38psuedu

zonal boundaries on dasymetric maps are basedon sharp changes in the statistical surface beingmapped while zonal boundaries on choroplethmaps demarcate enumeration units established formore general purposes (eg states within the US)The cartographer generates dasymetric zones byusing ancillary information This information canbe both objective and subjective depending onother available data and the cartographers knowl-edge of the area Second individual dasymetriczones are developed to be internally homogeneousIn contrast choropleth zones are not definedbased on the data and thus have varying levels ofinternal homogeneity Third choropleth mappingmethods have become standardized (including thedevelopment of common classification schemesSlocum 1999) but the wide range of dasymetricprocedures have been under researched

Surprisingly little literature exists on dasymet-ric mapping At a theoretical level MacEachren(1994) placed dasymetric maps in the continuumbetween isopleth and choropleth maps suggest-ing that dasymetric maps represent data half waybetween smooth and stepped statistical surfacesIn preparation for this research we studied the twomost frequently cited early pieces by Wright (1936)and McCleary (1969) More recent implementa-tions of dasymetric mapping methods within GISinclude Gerth (1993) Holloway et al (1996) andCharpentier (1997) Their work relies heavily on

Cartography and GeograPhic Information Science l1Jl28 No2 2001 pp125-138

Wright and McCleary and offers insight into thepractical challenges of producing dasymetric mapsusing digital data

The use of geographic information systems(GIS) for modern mapping has renewed interestin dasymetric mapping (eg DeMers 1997 Chris-man 1997) The lack of standardization in produc-tion methods is however an obstacle that preventswidespread use of dasymetric maps in GIS Addi-tionally most production methods have not beenevaluated through systematic testing (Robinsonet aI 1995) Recent research on areal interpo-lation (sometimes referred to as cross-area esti-mation) offers a possible solution to these prob-lems Though areal interpolation research hasnot focused on map production it can advancethe study of dasymetric mapping because it hasinvolved structured evaluation of methods that arerelated in their calculation approaches

Areal interpolation is the process of transfer-ring spatial data from one set of units to another(Bloom et aI 1996 Fisher and Langford 1996)and is often used to compare multiple datasetseach collected using different enumeration unitsFor example US socio-economic and populationdata are reported by census tract whereas market-ing data may be available by zip code and pollu-tion data may be collected using watersheds as enu-meration units Areal interpolation may be used totransfer all data to a common set of enumerationunits (eg watersheds) to permit efficient compari-son and analysis

Many areal interpolation methods can be incor-porated into dasymetric mapping methods toimprove the detail of a choropleth map below thelevel of the enumeration unit (Fisher and Lang-ford 1996) The primary difference between thetwo approaches is that dasymetric mapping omitsthe final step in areal interpolation of re-aggre-gating to a preferred enumeration unit type (seeEicher (1999) for a review of areal interpolationmethods) A common goal of most work on arealinterpolation has been evaluation of accuracy ofinterpolation methods This goal contrasts withthe largely demonstrative goals of cartographicwork on dasymetric mapping

One of our goals was to build connectionsbetween areal interpolation and dasymetric researchThe technique we selected from the areal inter-polation literature the binary method was oneof the most accurate in previous testing Thebinary method was also chosen because only minorchanges would be needed to implement it as thecore of a dasymetric mapping procedure Simpleareal weighting is another common areal interpola-

126

tion method (Goodchild and Lam 1980Flowerdewand Green 1989 Goodchild et aI 1993 Fisher andLangford 1996 Cockings et al 1997) Simple arealweighting could not however be applied to pro-duce dasymetric maps (at least within the frame-work of this experiment) because it does not useancillary variables We also do not report on theapplication of regression methods of areal interpo-lation These methods are discussed in many ofthe papers cited as well as in Moxey and Allanson(I 994) Flowerdewet aI (1991) and other work byFlowerdewand Green (1991 1992 1994) Regres-sion methods use global information for the entirestudy area to estimate zone values and are rela-tively complex when compared to techniques thathave traditionally been used to create dasymetricmaps in production cartography

Dasymetric mapping and the dasymetricmethod have become ambiguous terms with theinception of areal interpolation research In thatliterature the dasymetric method describes a spe-cificform of areal interpolation that wewill refer toas the grid binary method (Langford et al 199IFisher and Langford 1996 Cockings et aI 1997)In contrast cartographers use the term dasymet-ric to refer to a general type of thematic mapproduced using a variety of methods Adding fur-ther confusion to the issue the term dasymetriccould be used to describe most forms of areal inter-polation especially intelligent areal interpolationmethods These techniques build from Wrights(1936) idea of reassigning data between units andthey incorporate his approach of using additionalinformation to perform this operation Chrisman(1998) suggests that Wrights technique was actu-ally a form of areal interpolation

MethodsFivecriteria guided dasymetric map production forthe experimental evaluation we undertook Thefirst twocriteria are basic requirements of dasymet-ric mapping that distinguish it from choroplethand isopleth mappingbull Zone homogeneity Final dasymetric map

zones were internally homogeneousbull Abrupt boundaries Zone boundaries repre-

sented escarpments or marked changes inthe mapped variable

bull Disaggregation of source zones Some arealdisaggregation of the original or sourcezones was involved (counties served as thesource zones) Source zones were not smallenough to allow for the creation of a dasy-metric map merely through their re-aggre-gation

CartogmPhy and GeogmPhic Information Science

Figure 1 Dasymetric map zones that result from overlayof county and land-use boundaries in the study area Stateboundaries and example major cities are included forreference

bull Intelligent interpolation An areal interpolationwas performed transforming socio-economicdata from one set of zones (source counties) tonew map zones defined by the intersection ofcounties with land-use polygons This interpola-tion was intelligent because it was guided byuse of ancillary land-use data in contrast to theuse of simple areal weighting

bull Nested data The experiment was performed ata scale that allowed checking of mapped values(derived from county data) against enumeratedvalues (summed from block group data) Useof these nested county and block group dataallowed statistical and visual evaluations of theresulting dasymetric maps and errorsThe experiment wasstructured to answer the follow-

ing three questionsA Are there significant differences in the accuracy of

dasymetric maps produced using the five methodstested in the experiment

B Are there significant differences in the accuracy ofpolygonal dasymetric maps versus those based ongridded data

C On a more qualitative levelwhat is the visualcharac-ter of the dasymetric maps produced for the experi-ment and what are the patterns of error on thesemaps

Vol 28 No2

In the following sections we first describe thestudy area and explain our use ofthe land-use data-set in the mapping process Secondly we describesix census variables from which we made the testset of dasymetric maps Then we outline the gen-eral mapping procedure and explain the five dasy-metric mapping methods tested We tested meth-ods derived from traditional cartographic work aswell as an areal interpolation method not specifi-cally designed to be applied in a cartographic con-text Finally we describe our methods for calculat-ing map errors that were used in the analysis of thetest maps

Study AreaThe study area encompassed 159 counties inparts of four states-Pennsylvania West VirginiaMaryland Virginia-and the District of ColumbiaFigure 1 shows a map of the study area with aselection of major cities and states labeled A largenumber of counties were included to allow forparametric analysis of the results using countiesas observations In Virginia cities are treated asdistinct municipal units and so these smaller poly-gons were included as individual observations inthe analysis The varied character of the 159 coun-ties provided a rigorous demonstration of themethods The chosen study area was also appro-priate because it contained a variety of land-useclasses and spanned a number of major landformregions including the coastal plain ridge-and-val-ley province and Allegheny plateau We expecteddifferent population patterns among these regionsFinally for each mapped variable a wide rangeof values were present within the study area Forexample population density varied widely withinthe study area from high-density urban areassuch as Washington and Baltimore to low-densityrural areas in West Virginia and north-centralPennsylvania Similar variations can be foundfor black population and housing-value variables(Figure 2)

Though it would have been convenient to mapa combination of full states the study areas rect-angular shape lends itself well to compact displayThe study area was also designed to avoid theextremely large metropolitan areas of New York(which might overly skew error summaries) whilestill including major metropolitan areas such asWashington Baltimore Pittsburgh and Richmond(Figure 1) Also the Chesapeake Bay shoreline wasmostly removed to avoid difficulties in overlayingislands and narrow peninsular polygons Lastlypersonal familiarity with the study area allowed usto visually check results for realistic values which

127

15 ID 175

2106

375103424

17510 315

30 10 2231

1410 30

0102

b

251075

a

Total Population(dens ly per km)

~ 01D25

610 14

BI ek Populationdllnslly per 11m

was important in implementing several of the map-ping methods

A necessary element for creating dasymetric mapsfrom choropleth maps is an ancillary dataset Thisinformation is used to assist interpolation of datafrom the original zones (counties) to new mapzones For this experiment land-use data wereused in both polygon and grid form Monmonierand Schnell (1984) also used land-use data toproduce a dasymetric map Our land-use datasetwas obtained from the US Geological Survey inthe form of a digital polygon covering the entirecountry at 17500000 scale These data by Hitt(1991) were created from a map of major landuses (Anderson 1967) included in the NationalAtlas (US Geological Survey 1970) and were com-pared to more recent state-level land-use maps(USGeological Survey 1993) We clipped the datafor the study area using ArcViews GeoprocessingExtension to create approximately 330 polygonsThe grid dataset was created from the polygondataset using ArcViews Convert to Grid opera-tion A grid cell of lkm2 was chosen because we feltit was suitably sized larger than most census blockgroups (used for checking the maps) and smallerthan the dasymetric zones (shown in Figure 1)

Land-use Dataset

Mapped VariablesDasymetric mapping is most commonly applied topopulation density For this experiment six socio-economic variables reflecting different aspects ofpopulation distribution were mapped All datawere obtained from the 1990 US Census Twopopulation variables were chosen total popula-tion and black population Four housing variableswere also mapped counts of housing units valuedat less than $25000 $40-49000 $50-59000and $60-75000 These six-count variables weremapped as densities of people or housing unitsAltogether 30 dasymetric maps were tested sixmaps for each of five methods Our goal was tomake conclusions about the accuracy of the meth-ods using a sample of mapped variables We feltthat testing with a sample would produce strongerresults than relying on differences with a singlevariable which was usual in previous research

The six variables were selected to obtain aseries of original maps that had a variety of spa-tial patterns Figure 2 shows pattern differences oncounty choropleth maps Socio-economic variablesare generally highly correlated and our variablesare no exception Thus there are shared character-

cNumber of HomosValued atl bullthan $25000(deNlty per km)

00210 017

0171D 0 30

031 ID 055

0581D 222

224 ID 4249

Figure 2 Example choropleth county maps of three variablesused in dasymetric map evaluation

isticsin these map patterns Weattempted to selectpatterns that were distinct from the overall popula-tion pattern in at least some areas such as differ-ent clusterings of high values in the north southand through the ridge-and-valley region (runningnortheast to southwest across the study area) Thecorrelation between the density of total popula-

128 Cmtography and GeograPhic 1nformation Science

tion and density of black population was 070The correlations between total population and thefour housing variables ranged from 060 to 079while the correlations between black populationand the housing variables ranged from 027 to047 Among the four housing variables correla-tions ranged from 081 to 099

We selected data that were available for coun-ties and block groups-two nested scales of censusenumeration In the experiment dasymetric mapswere created using the county-level data for themapped variables Data available at the block-group level were then used to check the resultsof the dasymetric mapping methods Over 13000block groups were contained in the study area andwere used to calculate errors for 580 dasymetricmap zones

Dasymetric Mapping MethodsThree core methods from the dasymetric and arealinterpolation literature were tested Two methodswere based on dasymetric mapping research thepolygon three-class method and the traditionallimiting variable method The polygon three-classmethod was used to produce a series of maps in anacademic production cartography lab (Hollowayetal 1996) while the traditional limiting methodhas roots in the 65-year-old heavily cited paperby Wright (1936) The grid binary method whichuses raster land-use data was selected from theareal interpolation literature

The dasymetric methods were vector basedand the areal interpolation method was grid basedTo provide a balanced experiment we tested twoother methods that were adaptations of these previ-ously existing methods a grid three-class method(complementing the polygon three-class version)and a polygon binary method (complementing thegrid binary version) They were developed for thisresearch and have not been tested in the publishedliterature Wefelt it was necessary to test both gridand polygon versions of the methods to be ableto attribute differences in performance to differ-ences in the methods without worrying that thedifferences might be amibutable to data structuresThe limitingvariable method isa polygon dasymetricmethod whichwe did not adapt for grid processing

The following sub-sections describe the fivemethods we tested We pair related grid andpolygon methods in the descriptions listing themethod that originates from the literature beforethe derivative method (eg Gl before PI and P2before G2) Methods were implemented in ArcViewusing Avenue scripting language (see the Appen-dix in Eicher (1999) for scripts)

Vol28 No2

Grid and Polygon Binary Methods (GI and PI)For the grid binary method (Gl) 100 percentof the data in each county was assigned to onlythe urban and agriculturaVwoodland cells derivedfrom the raster land-use ancillary dataset No datawere assigned to the other classes (forested water)hence the label binary The primary advantage ofthis method was its simplicity It was only necessaryto reclassifYthe land-use data into two classes onefor inhabitable areas and another for uninhabit-able areas This subjective decision was dependenton the land-use classification set as well as informa-tion known about the mapped area

The binary method was originally developedfor use as a mapping technique (Langford et a1990 1991 Langford and Unwin 1994) but even-tually was applied in areal interpolation problems(Fisher and Langford 1995 Cockings et al 1997)It represents a specialized form of the limiting vari-able method that McCleary (1969) described Allland uses except urban and agriculturalwoodlandin our implementation are limited to zero forpolygons deemed uninhabitable

The fundamental operations of the G1methodwere adapted to work with polygon (versus grid-ded) land-use data to create method Pl The poly-gon binary method also assigns 100 percent ofthe data in each county to only the urban andagriculturalwoodland areas within the county withpopulation or housing densities recalculated forthese new areas

Polygon and Grid Three-class Methods (P2 and G2)For the polygon three-class method (P2) we useda weighting scheme to assign population or hous-ing data to three different land-use classes withineach county urban agriculturalwoodland and for-ested For counties with all three land-use classespresent 70 percent of the data for each county wasassigned to urban land-use polygons 20 percent toagriculturalwoodland polygons and 10 percent toforested land Water received no data The percent-ages we selected were a subjective decision partlybased on a previous implementation by Hollowayet al (1996) who mapped population density forMissoula County Montana Working with censustracts as their choropleth units they assigned 80percent of each tract population to urban polygons10 percent to open polygons and 5 percent eachto agricultural and forested areas Our percentagebreakdown (70-20-10) was chosen because we esti-mated that a larger percentage of the mapped vari-able would be found in agriculturaVwoodland andforested areas in our study region than for theregion mapped by the Montana researchers For

129

this experiment adjustments in the weighting per-centages were not made for different mapped vari-ables the same scheme was used for all variables

A major weakness of the three-class method isthat it does not account for the area of each partic-ular land use within each county A worst-case sce-nario would result for a county that had only one ortwo small urban polygons These polygons wouldstill receive 70 percent of the data for the entirecounty causing the urban areas in that county tohave unusually high densities and the other land-use areas to have lower densities

Method G2 represents a grid-based implemen-tation of method P2 As with the polygon versionthe method used a weighting scheme to assign datato different land-use classes within each countyFor counties with all three inhabitable land uses70 percent of county data were assigned to urbangrid cells 20 percent to agriculturalwoodland cellsand 10 percent to forested cells The method suf-fers from the same weaknesses as the P2 methodsubjectivity of the weightings and potential prob-lems with very small areas of a particular land usein a county As with Gl a major strength of thegrid three-class method was its easier implementa-tion in Avenue

Limiting Variable Method (P3)The final polygon method is the pure form of thelimiting variable method described by McCleary(1969) Wright (1936) also made use of this con-cept as did Gerth (1993) and Charpentier (1997)in their GIS implementations The first step in thismethod was to assign data by simple areal weight-ing to all inhabitable polygons in each county Forthe total population variable data were assignedso that the three inhabitable land-use types withina county (urban agriculturalwoodland and for-ested polygons) had equal population densitiesAt this step water was limited to zero densityNext we set thresholds of maximum density forparticular land uses and applied these throughoutthe study area For example agriculturalwoodlandareas were limited to 50 people per square kilome-ter and forested areas were assigned a lower thresh-old of 15 people per square kilometer for the totalpopulation variable we mapped degThefinal step inthe mapping process was use of these thresholdvalues to make adjustments to the data distributionwithin each county If a polygon density exceededits threshold it was assigned the threshold densityand the remaining data were removed from thearea These data were distributed evenly to theremaining zones in the county (Gerth 1993)

To decide the upper limits on the densitiesof the mapped variable for agriculturalwoodland

130

and forested land uses we examined data availableat the county level Eighteen counties in the data-set were classed entirely as agriculturalwoodlandand we used these to set thresholds For each ofthe sixvariables the agriculturalwoodland thresh-old was based on the value for the sixth highestcounty (70th percentile) The forested thresholdwas chosen using the same list of counties andbased on the value of the fourth lowest (20th per-centile) countyThis approach resulted in differentthresholds for each variable-a systematic custom-ization necessitated by differing magnitude rangesamong variables Only one county in the studyarea was fully forested yielding insufficient infor-mation for the selection of the maximum densitythreshold for forested polygons

Calculation and Analysis of ErrorMeasuresA varietyof waysof measuring error have been usedin areal interpolation research We chose to followFisherand Langford (1995)in their use of root meansquared (RMS)error and a coefficientof variation todescribe errors in dasymetric zones Other research-ers have used mean percent error and mean abso-lute percent error to summarize error (Goodchildetal 1993)We chose RMS error because it could beapplied to count data (eg total number of people)and waseasily interpreted as a value with the sameunits as the mapped variable

We began by calculating the RMS error foreach county using errors for dasymetric zoneswithin the countyThis calculation had three basicsteps1 Calculate the square of the difference between

the correct and interpolated counts for eachdasymetric zone

2 Calculate the mean of these squared errorswithin a county and

3 Take the square root of the mean to producea summary county error measure the RMSerrorThe correct data values for dasymetric zones

(needed for step 1) were calculated by summingpopulation or housing units for block groupswithin each dasymetric zone The coefficient ofvariation for each county was calculated by divid-ing the RMSerror by the correct county data value(number of people or housing units in 1990)Thisstandardized value allowed for direct comparisonacross the six mapped variables

A repeated-measures analysis of variance(ANOVA)wasused to test for significant differencesbetween mean coefficients of variation among thefive mapping methods Coefficients of variation

Cmtography and GeograPhic Information Science

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 2: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

Wright and McCleary and offers insight into thepractical challenges of producing dasymetric mapsusing digital data

The use of geographic information systems(GIS) for modern mapping has renewed interestin dasymetric mapping (eg DeMers 1997 Chris-man 1997) The lack of standardization in produc-tion methods is however an obstacle that preventswidespread use of dasymetric maps in GIS Addi-tionally most production methods have not beenevaluated through systematic testing (Robinsonet aI 1995) Recent research on areal interpo-lation (sometimes referred to as cross-area esti-mation) offers a possible solution to these prob-lems Though areal interpolation research hasnot focused on map production it can advancethe study of dasymetric mapping because it hasinvolved structured evaluation of methods that arerelated in their calculation approaches

Areal interpolation is the process of transfer-ring spatial data from one set of units to another(Bloom et aI 1996 Fisher and Langford 1996)and is often used to compare multiple datasetseach collected using different enumeration unitsFor example US socio-economic and populationdata are reported by census tract whereas market-ing data may be available by zip code and pollu-tion data may be collected using watersheds as enu-meration units Areal interpolation may be used totransfer all data to a common set of enumerationunits (eg watersheds) to permit efficient compari-son and analysis

Many areal interpolation methods can be incor-porated into dasymetric mapping methods toimprove the detail of a choropleth map below thelevel of the enumeration unit (Fisher and Lang-ford 1996) The primary difference between thetwo approaches is that dasymetric mapping omitsthe final step in areal interpolation of re-aggre-gating to a preferred enumeration unit type (seeEicher (1999) for a review of areal interpolationmethods) A common goal of most work on arealinterpolation has been evaluation of accuracy ofinterpolation methods This goal contrasts withthe largely demonstrative goals of cartographicwork on dasymetric mapping

One of our goals was to build connectionsbetween areal interpolation and dasymetric researchThe technique we selected from the areal inter-polation literature the binary method was oneof the most accurate in previous testing Thebinary method was also chosen because only minorchanges would be needed to implement it as thecore of a dasymetric mapping procedure Simpleareal weighting is another common areal interpola-

126

tion method (Goodchild and Lam 1980Flowerdewand Green 1989 Goodchild et aI 1993 Fisher andLangford 1996 Cockings et al 1997) Simple arealweighting could not however be applied to pro-duce dasymetric maps (at least within the frame-work of this experiment) because it does not useancillary variables We also do not report on theapplication of regression methods of areal interpo-lation These methods are discussed in many ofthe papers cited as well as in Moxey and Allanson(I 994) Flowerdewet aI (1991) and other work byFlowerdewand Green (1991 1992 1994) Regres-sion methods use global information for the entirestudy area to estimate zone values and are rela-tively complex when compared to techniques thathave traditionally been used to create dasymetricmaps in production cartography

Dasymetric mapping and the dasymetricmethod have become ambiguous terms with theinception of areal interpolation research In thatliterature the dasymetric method describes a spe-cificform of areal interpolation that wewill refer toas the grid binary method (Langford et al 199IFisher and Langford 1996 Cockings et aI 1997)In contrast cartographers use the term dasymet-ric to refer to a general type of thematic mapproduced using a variety of methods Adding fur-ther confusion to the issue the term dasymetriccould be used to describe most forms of areal inter-polation especially intelligent areal interpolationmethods These techniques build from Wrights(1936) idea of reassigning data between units andthey incorporate his approach of using additionalinformation to perform this operation Chrisman(1998) suggests that Wrights technique was actu-ally a form of areal interpolation

MethodsFivecriteria guided dasymetric map production forthe experimental evaluation we undertook Thefirst twocriteria are basic requirements of dasymet-ric mapping that distinguish it from choroplethand isopleth mappingbull Zone homogeneity Final dasymetric map

zones were internally homogeneousbull Abrupt boundaries Zone boundaries repre-

sented escarpments or marked changes inthe mapped variable

bull Disaggregation of source zones Some arealdisaggregation of the original or sourcezones was involved (counties served as thesource zones) Source zones were not smallenough to allow for the creation of a dasy-metric map merely through their re-aggre-gation

CartogmPhy and GeogmPhic Information Science

Figure 1 Dasymetric map zones that result from overlayof county and land-use boundaries in the study area Stateboundaries and example major cities are included forreference

bull Intelligent interpolation An areal interpolationwas performed transforming socio-economicdata from one set of zones (source counties) tonew map zones defined by the intersection ofcounties with land-use polygons This interpola-tion was intelligent because it was guided byuse of ancillary land-use data in contrast to theuse of simple areal weighting

bull Nested data The experiment was performed ata scale that allowed checking of mapped values(derived from county data) against enumeratedvalues (summed from block group data) Useof these nested county and block group dataallowed statistical and visual evaluations of theresulting dasymetric maps and errorsThe experiment wasstructured to answer the follow-

ing three questionsA Are there significant differences in the accuracy of

dasymetric maps produced using the five methodstested in the experiment

B Are there significant differences in the accuracy ofpolygonal dasymetric maps versus those based ongridded data

C On a more qualitative levelwhat is the visualcharac-ter of the dasymetric maps produced for the experi-ment and what are the patterns of error on thesemaps

Vol 28 No2

In the following sections we first describe thestudy area and explain our use ofthe land-use data-set in the mapping process Secondly we describesix census variables from which we made the testset of dasymetric maps Then we outline the gen-eral mapping procedure and explain the five dasy-metric mapping methods tested We tested meth-ods derived from traditional cartographic work aswell as an areal interpolation method not specifi-cally designed to be applied in a cartographic con-text Finally we describe our methods for calculat-ing map errors that were used in the analysis of thetest maps

Study AreaThe study area encompassed 159 counties inparts of four states-Pennsylvania West VirginiaMaryland Virginia-and the District of ColumbiaFigure 1 shows a map of the study area with aselection of major cities and states labeled A largenumber of counties were included to allow forparametric analysis of the results using countiesas observations In Virginia cities are treated asdistinct municipal units and so these smaller poly-gons were included as individual observations inthe analysis The varied character of the 159 coun-ties provided a rigorous demonstration of themethods The chosen study area was also appro-priate because it contained a variety of land-useclasses and spanned a number of major landformregions including the coastal plain ridge-and-val-ley province and Allegheny plateau We expecteddifferent population patterns among these regionsFinally for each mapped variable a wide rangeof values were present within the study area Forexample population density varied widely withinthe study area from high-density urban areassuch as Washington and Baltimore to low-densityrural areas in West Virginia and north-centralPennsylvania Similar variations can be foundfor black population and housing-value variables(Figure 2)

Though it would have been convenient to mapa combination of full states the study areas rect-angular shape lends itself well to compact displayThe study area was also designed to avoid theextremely large metropolitan areas of New York(which might overly skew error summaries) whilestill including major metropolitan areas such asWashington Baltimore Pittsburgh and Richmond(Figure 1) Also the Chesapeake Bay shoreline wasmostly removed to avoid difficulties in overlayingislands and narrow peninsular polygons Lastlypersonal familiarity with the study area allowed usto visually check results for realistic values which

127

15 ID 175

2106

375103424

17510 315

30 10 2231

1410 30

0102

b

251075

a

Total Population(dens ly per km)

~ 01D25

610 14

BI ek Populationdllnslly per 11m

was important in implementing several of the map-ping methods

A necessary element for creating dasymetric mapsfrom choropleth maps is an ancillary dataset Thisinformation is used to assist interpolation of datafrom the original zones (counties) to new mapzones For this experiment land-use data wereused in both polygon and grid form Monmonierand Schnell (1984) also used land-use data toproduce a dasymetric map Our land-use datasetwas obtained from the US Geological Survey inthe form of a digital polygon covering the entirecountry at 17500000 scale These data by Hitt(1991) were created from a map of major landuses (Anderson 1967) included in the NationalAtlas (US Geological Survey 1970) and were com-pared to more recent state-level land-use maps(USGeological Survey 1993) We clipped the datafor the study area using ArcViews GeoprocessingExtension to create approximately 330 polygonsThe grid dataset was created from the polygondataset using ArcViews Convert to Grid opera-tion A grid cell of lkm2 was chosen because we feltit was suitably sized larger than most census blockgroups (used for checking the maps) and smallerthan the dasymetric zones (shown in Figure 1)

Land-use Dataset

Mapped VariablesDasymetric mapping is most commonly applied topopulation density For this experiment six socio-economic variables reflecting different aspects ofpopulation distribution were mapped All datawere obtained from the 1990 US Census Twopopulation variables were chosen total popula-tion and black population Four housing variableswere also mapped counts of housing units valuedat less than $25000 $40-49000 $50-59000and $60-75000 These six-count variables weremapped as densities of people or housing unitsAltogether 30 dasymetric maps were tested sixmaps for each of five methods Our goal was tomake conclusions about the accuracy of the meth-ods using a sample of mapped variables We feltthat testing with a sample would produce strongerresults than relying on differences with a singlevariable which was usual in previous research

The six variables were selected to obtain aseries of original maps that had a variety of spa-tial patterns Figure 2 shows pattern differences oncounty choropleth maps Socio-economic variablesare generally highly correlated and our variablesare no exception Thus there are shared character-

cNumber of HomosValued atl bullthan $25000(deNlty per km)

00210 017

0171D 0 30

031 ID 055

0581D 222

224 ID 4249

Figure 2 Example choropleth county maps of three variablesused in dasymetric map evaluation

isticsin these map patterns Weattempted to selectpatterns that were distinct from the overall popula-tion pattern in at least some areas such as differ-ent clusterings of high values in the north southand through the ridge-and-valley region (runningnortheast to southwest across the study area) Thecorrelation between the density of total popula-

128 Cmtography and GeograPhic 1nformation Science

tion and density of black population was 070The correlations between total population and thefour housing variables ranged from 060 to 079while the correlations between black populationand the housing variables ranged from 027 to047 Among the four housing variables correla-tions ranged from 081 to 099

We selected data that were available for coun-ties and block groups-two nested scales of censusenumeration In the experiment dasymetric mapswere created using the county-level data for themapped variables Data available at the block-group level were then used to check the resultsof the dasymetric mapping methods Over 13000block groups were contained in the study area andwere used to calculate errors for 580 dasymetricmap zones

Dasymetric Mapping MethodsThree core methods from the dasymetric and arealinterpolation literature were tested Two methodswere based on dasymetric mapping research thepolygon three-class method and the traditionallimiting variable method The polygon three-classmethod was used to produce a series of maps in anacademic production cartography lab (Hollowayetal 1996) while the traditional limiting methodhas roots in the 65-year-old heavily cited paperby Wright (1936) The grid binary method whichuses raster land-use data was selected from theareal interpolation literature

The dasymetric methods were vector basedand the areal interpolation method was grid basedTo provide a balanced experiment we tested twoother methods that were adaptations of these previ-ously existing methods a grid three-class method(complementing the polygon three-class version)and a polygon binary method (complementing thegrid binary version) They were developed for thisresearch and have not been tested in the publishedliterature Wefelt it was necessary to test both gridand polygon versions of the methods to be ableto attribute differences in performance to differ-ences in the methods without worrying that thedifferences might be amibutable to data structuresThe limitingvariable method isa polygon dasymetricmethod whichwe did not adapt for grid processing

The following sub-sections describe the fivemethods we tested We pair related grid andpolygon methods in the descriptions listing themethod that originates from the literature beforethe derivative method (eg Gl before PI and P2before G2) Methods were implemented in ArcViewusing Avenue scripting language (see the Appen-dix in Eicher (1999) for scripts)

Vol28 No2

Grid and Polygon Binary Methods (GI and PI)For the grid binary method (Gl) 100 percentof the data in each county was assigned to onlythe urban and agriculturaVwoodland cells derivedfrom the raster land-use ancillary dataset No datawere assigned to the other classes (forested water)hence the label binary The primary advantage ofthis method was its simplicity It was only necessaryto reclassifYthe land-use data into two classes onefor inhabitable areas and another for uninhabit-able areas This subjective decision was dependenton the land-use classification set as well as informa-tion known about the mapped area

The binary method was originally developedfor use as a mapping technique (Langford et a1990 1991 Langford and Unwin 1994) but even-tually was applied in areal interpolation problems(Fisher and Langford 1995 Cockings et al 1997)It represents a specialized form of the limiting vari-able method that McCleary (1969) described Allland uses except urban and agriculturalwoodlandin our implementation are limited to zero forpolygons deemed uninhabitable

The fundamental operations of the G1methodwere adapted to work with polygon (versus grid-ded) land-use data to create method Pl The poly-gon binary method also assigns 100 percent ofthe data in each county to only the urban andagriculturalwoodland areas within the county withpopulation or housing densities recalculated forthese new areas

Polygon and Grid Three-class Methods (P2 and G2)For the polygon three-class method (P2) we useda weighting scheme to assign population or hous-ing data to three different land-use classes withineach county urban agriculturalwoodland and for-ested For counties with all three land-use classespresent 70 percent of the data for each county wasassigned to urban land-use polygons 20 percent toagriculturalwoodland polygons and 10 percent toforested land Water received no data The percent-ages we selected were a subjective decision partlybased on a previous implementation by Hollowayet al (1996) who mapped population density forMissoula County Montana Working with censustracts as their choropleth units they assigned 80percent of each tract population to urban polygons10 percent to open polygons and 5 percent eachto agricultural and forested areas Our percentagebreakdown (70-20-10) was chosen because we esti-mated that a larger percentage of the mapped vari-able would be found in agriculturaVwoodland andforested areas in our study region than for theregion mapped by the Montana researchers For

129

this experiment adjustments in the weighting per-centages were not made for different mapped vari-ables the same scheme was used for all variables

A major weakness of the three-class method isthat it does not account for the area of each partic-ular land use within each county A worst-case sce-nario would result for a county that had only one ortwo small urban polygons These polygons wouldstill receive 70 percent of the data for the entirecounty causing the urban areas in that county tohave unusually high densities and the other land-use areas to have lower densities

Method G2 represents a grid-based implemen-tation of method P2 As with the polygon versionthe method used a weighting scheme to assign datato different land-use classes within each countyFor counties with all three inhabitable land uses70 percent of county data were assigned to urbangrid cells 20 percent to agriculturalwoodland cellsand 10 percent to forested cells The method suf-fers from the same weaknesses as the P2 methodsubjectivity of the weightings and potential prob-lems with very small areas of a particular land usein a county As with Gl a major strength of thegrid three-class method was its easier implementa-tion in Avenue

Limiting Variable Method (P3)The final polygon method is the pure form of thelimiting variable method described by McCleary(1969) Wright (1936) also made use of this con-cept as did Gerth (1993) and Charpentier (1997)in their GIS implementations The first step in thismethod was to assign data by simple areal weight-ing to all inhabitable polygons in each county Forthe total population variable data were assignedso that the three inhabitable land-use types withina county (urban agriculturalwoodland and for-ested polygons) had equal population densitiesAt this step water was limited to zero densityNext we set thresholds of maximum density forparticular land uses and applied these throughoutthe study area For example agriculturalwoodlandareas were limited to 50 people per square kilome-ter and forested areas were assigned a lower thresh-old of 15 people per square kilometer for the totalpopulation variable we mapped degThefinal step inthe mapping process was use of these thresholdvalues to make adjustments to the data distributionwithin each county If a polygon density exceededits threshold it was assigned the threshold densityand the remaining data were removed from thearea These data were distributed evenly to theremaining zones in the county (Gerth 1993)

To decide the upper limits on the densitiesof the mapped variable for agriculturalwoodland

130

and forested land uses we examined data availableat the county level Eighteen counties in the data-set were classed entirely as agriculturalwoodlandand we used these to set thresholds For each ofthe sixvariables the agriculturalwoodland thresh-old was based on the value for the sixth highestcounty (70th percentile) The forested thresholdwas chosen using the same list of counties andbased on the value of the fourth lowest (20th per-centile) countyThis approach resulted in differentthresholds for each variable-a systematic custom-ization necessitated by differing magnitude rangesamong variables Only one county in the studyarea was fully forested yielding insufficient infor-mation for the selection of the maximum densitythreshold for forested polygons

Calculation and Analysis of ErrorMeasuresA varietyof waysof measuring error have been usedin areal interpolation research We chose to followFisherand Langford (1995)in their use of root meansquared (RMS)error and a coefficientof variation todescribe errors in dasymetric zones Other research-ers have used mean percent error and mean abso-lute percent error to summarize error (Goodchildetal 1993)We chose RMS error because it could beapplied to count data (eg total number of people)and waseasily interpreted as a value with the sameunits as the mapped variable

We began by calculating the RMS error foreach county using errors for dasymetric zoneswithin the countyThis calculation had three basicsteps1 Calculate the square of the difference between

the correct and interpolated counts for eachdasymetric zone

2 Calculate the mean of these squared errorswithin a county and

3 Take the square root of the mean to producea summary county error measure the RMSerrorThe correct data values for dasymetric zones

(needed for step 1) were calculated by summingpopulation or housing units for block groupswithin each dasymetric zone The coefficient ofvariation for each county was calculated by divid-ing the RMSerror by the correct county data value(number of people or housing units in 1990)Thisstandardized value allowed for direct comparisonacross the six mapped variables

A repeated-measures analysis of variance(ANOVA)wasused to test for significant differencesbetween mean coefficients of variation among thefive mapping methods Coefficients of variation

Cmtography and GeograPhic Information Science

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 3: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

Figure 1 Dasymetric map zones that result from overlayof county and land-use boundaries in the study area Stateboundaries and example major cities are included forreference

bull Intelligent interpolation An areal interpolationwas performed transforming socio-economicdata from one set of zones (source counties) tonew map zones defined by the intersection ofcounties with land-use polygons This interpola-tion was intelligent because it was guided byuse of ancillary land-use data in contrast to theuse of simple areal weighting

bull Nested data The experiment was performed ata scale that allowed checking of mapped values(derived from county data) against enumeratedvalues (summed from block group data) Useof these nested county and block group dataallowed statistical and visual evaluations of theresulting dasymetric maps and errorsThe experiment wasstructured to answer the follow-

ing three questionsA Are there significant differences in the accuracy of

dasymetric maps produced using the five methodstested in the experiment

B Are there significant differences in the accuracy ofpolygonal dasymetric maps versus those based ongridded data

C On a more qualitative levelwhat is the visualcharac-ter of the dasymetric maps produced for the experi-ment and what are the patterns of error on thesemaps

Vol 28 No2

In the following sections we first describe thestudy area and explain our use ofthe land-use data-set in the mapping process Secondly we describesix census variables from which we made the testset of dasymetric maps Then we outline the gen-eral mapping procedure and explain the five dasy-metric mapping methods tested We tested meth-ods derived from traditional cartographic work aswell as an areal interpolation method not specifi-cally designed to be applied in a cartographic con-text Finally we describe our methods for calculat-ing map errors that were used in the analysis of thetest maps

Study AreaThe study area encompassed 159 counties inparts of four states-Pennsylvania West VirginiaMaryland Virginia-and the District of ColumbiaFigure 1 shows a map of the study area with aselection of major cities and states labeled A largenumber of counties were included to allow forparametric analysis of the results using countiesas observations In Virginia cities are treated asdistinct municipal units and so these smaller poly-gons were included as individual observations inthe analysis The varied character of the 159 coun-ties provided a rigorous demonstration of themethods The chosen study area was also appro-priate because it contained a variety of land-useclasses and spanned a number of major landformregions including the coastal plain ridge-and-val-ley province and Allegheny plateau We expecteddifferent population patterns among these regionsFinally for each mapped variable a wide rangeof values were present within the study area Forexample population density varied widely withinthe study area from high-density urban areassuch as Washington and Baltimore to low-densityrural areas in West Virginia and north-centralPennsylvania Similar variations can be foundfor black population and housing-value variables(Figure 2)

Though it would have been convenient to mapa combination of full states the study areas rect-angular shape lends itself well to compact displayThe study area was also designed to avoid theextremely large metropolitan areas of New York(which might overly skew error summaries) whilestill including major metropolitan areas such asWashington Baltimore Pittsburgh and Richmond(Figure 1) Also the Chesapeake Bay shoreline wasmostly removed to avoid difficulties in overlayingislands and narrow peninsular polygons Lastlypersonal familiarity with the study area allowed usto visually check results for realistic values which

127

15 ID 175

2106

375103424

17510 315

30 10 2231

1410 30

0102

b

251075

a

Total Population(dens ly per km)

~ 01D25

610 14

BI ek Populationdllnslly per 11m

was important in implementing several of the map-ping methods

A necessary element for creating dasymetric mapsfrom choropleth maps is an ancillary dataset Thisinformation is used to assist interpolation of datafrom the original zones (counties) to new mapzones For this experiment land-use data wereused in both polygon and grid form Monmonierand Schnell (1984) also used land-use data toproduce a dasymetric map Our land-use datasetwas obtained from the US Geological Survey inthe form of a digital polygon covering the entirecountry at 17500000 scale These data by Hitt(1991) were created from a map of major landuses (Anderson 1967) included in the NationalAtlas (US Geological Survey 1970) and were com-pared to more recent state-level land-use maps(USGeological Survey 1993) We clipped the datafor the study area using ArcViews GeoprocessingExtension to create approximately 330 polygonsThe grid dataset was created from the polygondataset using ArcViews Convert to Grid opera-tion A grid cell of lkm2 was chosen because we feltit was suitably sized larger than most census blockgroups (used for checking the maps) and smallerthan the dasymetric zones (shown in Figure 1)

Land-use Dataset

Mapped VariablesDasymetric mapping is most commonly applied topopulation density For this experiment six socio-economic variables reflecting different aspects ofpopulation distribution were mapped All datawere obtained from the 1990 US Census Twopopulation variables were chosen total popula-tion and black population Four housing variableswere also mapped counts of housing units valuedat less than $25000 $40-49000 $50-59000and $60-75000 These six-count variables weremapped as densities of people or housing unitsAltogether 30 dasymetric maps were tested sixmaps for each of five methods Our goal was tomake conclusions about the accuracy of the meth-ods using a sample of mapped variables We feltthat testing with a sample would produce strongerresults than relying on differences with a singlevariable which was usual in previous research

The six variables were selected to obtain aseries of original maps that had a variety of spa-tial patterns Figure 2 shows pattern differences oncounty choropleth maps Socio-economic variablesare generally highly correlated and our variablesare no exception Thus there are shared character-

cNumber of HomosValued atl bullthan $25000(deNlty per km)

00210 017

0171D 0 30

031 ID 055

0581D 222

224 ID 4249

Figure 2 Example choropleth county maps of three variablesused in dasymetric map evaluation

isticsin these map patterns Weattempted to selectpatterns that were distinct from the overall popula-tion pattern in at least some areas such as differ-ent clusterings of high values in the north southand through the ridge-and-valley region (runningnortheast to southwest across the study area) Thecorrelation between the density of total popula-

128 Cmtography and GeograPhic 1nformation Science

tion and density of black population was 070The correlations between total population and thefour housing variables ranged from 060 to 079while the correlations between black populationand the housing variables ranged from 027 to047 Among the four housing variables correla-tions ranged from 081 to 099

We selected data that were available for coun-ties and block groups-two nested scales of censusenumeration In the experiment dasymetric mapswere created using the county-level data for themapped variables Data available at the block-group level were then used to check the resultsof the dasymetric mapping methods Over 13000block groups were contained in the study area andwere used to calculate errors for 580 dasymetricmap zones

Dasymetric Mapping MethodsThree core methods from the dasymetric and arealinterpolation literature were tested Two methodswere based on dasymetric mapping research thepolygon three-class method and the traditionallimiting variable method The polygon three-classmethod was used to produce a series of maps in anacademic production cartography lab (Hollowayetal 1996) while the traditional limiting methodhas roots in the 65-year-old heavily cited paperby Wright (1936) The grid binary method whichuses raster land-use data was selected from theareal interpolation literature

The dasymetric methods were vector basedand the areal interpolation method was grid basedTo provide a balanced experiment we tested twoother methods that were adaptations of these previ-ously existing methods a grid three-class method(complementing the polygon three-class version)and a polygon binary method (complementing thegrid binary version) They were developed for thisresearch and have not been tested in the publishedliterature Wefelt it was necessary to test both gridand polygon versions of the methods to be ableto attribute differences in performance to differ-ences in the methods without worrying that thedifferences might be amibutable to data structuresThe limitingvariable method isa polygon dasymetricmethod whichwe did not adapt for grid processing

The following sub-sections describe the fivemethods we tested We pair related grid andpolygon methods in the descriptions listing themethod that originates from the literature beforethe derivative method (eg Gl before PI and P2before G2) Methods were implemented in ArcViewusing Avenue scripting language (see the Appen-dix in Eicher (1999) for scripts)

Vol28 No2

Grid and Polygon Binary Methods (GI and PI)For the grid binary method (Gl) 100 percentof the data in each county was assigned to onlythe urban and agriculturaVwoodland cells derivedfrom the raster land-use ancillary dataset No datawere assigned to the other classes (forested water)hence the label binary The primary advantage ofthis method was its simplicity It was only necessaryto reclassifYthe land-use data into two classes onefor inhabitable areas and another for uninhabit-able areas This subjective decision was dependenton the land-use classification set as well as informa-tion known about the mapped area

The binary method was originally developedfor use as a mapping technique (Langford et a1990 1991 Langford and Unwin 1994) but even-tually was applied in areal interpolation problems(Fisher and Langford 1995 Cockings et al 1997)It represents a specialized form of the limiting vari-able method that McCleary (1969) described Allland uses except urban and agriculturalwoodlandin our implementation are limited to zero forpolygons deemed uninhabitable

The fundamental operations of the G1methodwere adapted to work with polygon (versus grid-ded) land-use data to create method Pl The poly-gon binary method also assigns 100 percent ofthe data in each county to only the urban andagriculturalwoodland areas within the county withpopulation or housing densities recalculated forthese new areas

Polygon and Grid Three-class Methods (P2 and G2)For the polygon three-class method (P2) we useda weighting scheme to assign population or hous-ing data to three different land-use classes withineach county urban agriculturalwoodland and for-ested For counties with all three land-use classespresent 70 percent of the data for each county wasassigned to urban land-use polygons 20 percent toagriculturalwoodland polygons and 10 percent toforested land Water received no data The percent-ages we selected were a subjective decision partlybased on a previous implementation by Hollowayet al (1996) who mapped population density forMissoula County Montana Working with censustracts as their choropleth units they assigned 80percent of each tract population to urban polygons10 percent to open polygons and 5 percent eachto agricultural and forested areas Our percentagebreakdown (70-20-10) was chosen because we esti-mated that a larger percentage of the mapped vari-able would be found in agriculturaVwoodland andforested areas in our study region than for theregion mapped by the Montana researchers For

129

this experiment adjustments in the weighting per-centages were not made for different mapped vari-ables the same scheme was used for all variables

A major weakness of the three-class method isthat it does not account for the area of each partic-ular land use within each county A worst-case sce-nario would result for a county that had only one ortwo small urban polygons These polygons wouldstill receive 70 percent of the data for the entirecounty causing the urban areas in that county tohave unusually high densities and the other land-use areas to have lower densities

Method G2 represents a grid-based implemen-tation of method P2 As with the polygon versionthe method used a weighting scheme to assign datato different land-use classes within each countyFor counties with all three inhabitable land uses70 percent of county data were assigned to urbangrid cells 20 percent to agriculturalwoodland cellsand 10 percent to forested cells The method suf-fers from the same weaknesses as the P2 methodsubjectivity of the weightings and potential prob-lems with very small areas of a particular land usein a county As with Gl a major strength of thegrid three-class method was its easier implementa-tion in Avenue

Limiting Variable Method (P3)The final polygon method is the pure form of thelimiting variable method described by McCleary(1969) Wright (1936) also made use of this con-cept as did Gerth (1993) and Charpentier (1997)in their GIS implementations The first step in thismethod was to assign data by simple areal weight-ing to all inhabitable polygons in each county Forthe total population variable data were assignedso that the three inhabitable land-use types withina county (urban agriculturalwoodland and for-ested polygons) had equal population densitiesAt this step water was limited to zero densityNext we set thresholds of maximum density forparticular land uses and applied these throughoutthe study area For example agriculturalwoodlandareas were limited to 50 people per square kilome-ter and forested areas were assigned a lower thresh-old of 15 people per square kilometer for the totalpopulation variable we mapped degThefinal step inthe mapping process was use of these thresholdvalues to make adjustments to the data distributionwithin each county If a polygon density exceededits threshold it was assigned the threshold densityand the remaining data were removed from thearea These data were distributed evenly to theremaining zones in the county (Gerth 1993)

To decide the upper limits on the densitiesof the mapped variable for agriculturalwoodland

130

and forested land uses we examined data availableat the county level Eighteen counties in the data-set were classed entirely as agriculturalwoodlandand we used these to set thresholds For each ofthe sixvariables the agriculturalwoodland thresh-old was based on the value for the sixth highestcounty (70th percentile) The forested thresholdwas chosen using the same list of counties andbased on the value of the fourth lowest (20th per-centile) countyThis approach resulted in differentthresholds for each variable-a systematic custom-ization necessitated by differing magnitude rangesamong variables Only one county in the studyarea was fully forested yielding insufficient infor-mation for the selection of the maximum densitythreshold for forested polygons

Calculation and Analysis of ErrorMeasuresA varietyof waysof measuring error have been usedin areal interpolation research We chose to followFisherand Langford (1995)in their use of root meansquared (RMS)error and a coefficientof variation todescribe errors in dasymetric zones Other research-ers have used mean percent error and mean abso-lute percent error to summarize error (Goodchildetal 1993)We chose RMS error because it could beapplied to count data (eg total number of people)and waseasily interpreted as a value with the sameunits as the mapped variable

We began by calculating the RMS error foreach county using errors for dasymetric zoneswithin the countyThis calculation had three basicsteps1 Calculate the square of the difference between

the correct and interpolated counts for eachdasymetric zone

2 Calculate the mean of these squared errorswithin a county and

3 Take the square root of the mean to producea summary county error measure the RMSerrorThe correct data values for dasymetric zones

(needed for step 1) were calculated by summingpopulation or housing units for block groupswithin each dasymetric zone The coefficient ofvariation for each county was calculated by divid-ing the RMSerror by the correct county data value(number of people or housing units in 1990)Thisstandardized value allowed for direct comparisonacross the six mapped variables

A repeated-measures analysis of variance(ANOVA)wasused to test for significant differencesbetween mean coefficients of variation among thefive mapping methods Coefficients of variation

Cmtography and GeograPhic Information Science

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 4: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

15 ID 175

2106

375103424

17510 315

30 10 2231

1410 30

0102

b

251075

a

Total Population(dens ly per km)

~ 01D25

610 14

BI ek Populationdllnslly per 11m

was important in implementing several of the map-ping methods

A necessary element for creating dasymetric mapsfrom choropleth maps is an ancillary dataset Thisinformation is used to assist interpolation of datafrom the original zones (counties) to new mapzones For this experiment land-use data wereused in both polygon and grid form Monmonierand Schnell (1984) also used land-use data toproduce a dasymetric map Our land-use datasetwas obtained from the US Geological Survey inthe form of a digital polygon covering the entirecountry at 17500000 scale These data by Hitt(1991) were created from a map of major landuses (Anderson 1967) included in the NationalAtlas (US Geological Survey 1970) and were com-pared to more recent state-level land-use maps(USGeological Survey 1993) We clipped the datafor the study area using ArcViews GeoprocessingExtension to create approximately 330 polygonsThe grid dataset was created from the polygondataset using ArcViews Convert to Grid opera-tion A grid cell of lkm2 was chosen because we feltit was suitably sized larger than most census blockgroups (used for checking the maps) and smallerthan the dasymetric zones (shown in Figure 1)

Land-use Dataset

Mapped VariablesDasymetric mapping is most commonly applied topopulation density For this experiment six socio-economic variables reflecting different aspects ofpopulation distribution were mapped All datawere obtained from the 1990 US Census Twopopulation variables were chosen total popula-tion and black population Four housing variableswere also mapped counts of housing units valuedat less than $25000 $40-49000 $50-59000and $60-75000 These six-count variables weremapped as densities of people or housing unitsAltogether 30 dasymetric maps were tested sixmaps for each of five methods Our goal was tomake conclusions about the accuracy of the meth-ods using a sample of mapped variables We feltthat testing with a sample would produce strongerresults than relying on differences with a singlevariable which was usual in previous research

The six variables were selected to obtain aseries of original maps that had a variety of spa-tial patterns Figure 2 shows pattern differences oncounty choropleth maps Socio-economic variablesare generally highly correlated and our variablesare no exception Thus there are shared character-

cNumber of HomosValued atl bullthan $25000(deNlty per km)

00210 017

0171D 0 30

031 ID 055

0581D 222

224 ID 4249

Figure 2 Example choropleth county maps of three variablesused in dasymetric map evaluation

isticsin these map patterns Weattempted to selectpatterns that were distinct from the overall popula-tion pattern in at least some areas such as differ-ent clusterings of high values in the north southand through the ridge-and-valley region (runningnortheast to southwest across the study area) Thecorrelation between the density of total popula-

128 Cmtography and GeograPhic 1nformation Science

tion and density of black population was 070The correlations between total population and thefour housing variables ranged from 060 to 079while the correlations between black populationand the housing variables ranged from 027 to047 Among the four housing variables correla-tions ranged from 081 to 099

We selected data that were available for coun-ties and block groups-two nested scales of censusenumeration In the experiment dasymetric mapswere created using the county-level data for themapped variables Data available at the block-group level were then used to check the resultsof the dasymetric mapping methods Over 13000block groups were contained in the study area andwere used to calculate errors for 580 dasymetricmap zones

Dasymetric Mapping MethodsThree core methods from the dasymetric and arealinterpolation literature were tested Two methodswere based on dasymetric mapping research thepolygon three-class method and the traditionallimiting variable method The polygon three-classmethod was used to produce a series of maps in anacademic production cartography lab (Hollowayetal 1996) while the traditional limiting methodhas roots in the 65-year-old heavily cited paperby Wright (1936) The grid binary method whichuses raster land-use data was selected from theareal interpolation literature

The dasymetric methods were vector basedand the areal interpolation method was grid basedTo provide a balanced experiment we tested twoother methods that were adaptations of these previ-ously existing methods a grid three-class method(complementing the polygon three-class version)and a polygon binary method (complementing thegrid binary version) They were developed for thisresearch and have not been tested in the publishedliterature Wefelt it was necessary to test both gridand polygon versions of the methods to be ableto attribute differences in performance to differ-ences in the methods without worrying that thedifferences might be amibutable to data structuresThe limitingvariable method isa polygon dasymetricmethod whichwe did not adapt for grid processing

The following sub-sections describe the fivemethods we tested We pair related grid andpolygon methods in the descriptions listing themethod that originates from the literature beforethe derivative method (eg Gl before PI and P2before G2) Methods were implemented in ArcViewusing Avenue scripting language (see the Appen-dix in Eicher (1999) for scripts)

Vol28 No2

Grid and Polygon Binary Methods (GI and PI)For the grid binary method (Gl) 100 percentof the data in each county was assigned to onlythe urban and agriculturaVwoodland cells derivedfrom the raster land-use ancillary dataset No datawere assigned to the other classes (forested water)hence the label binary The primary advantage ofthis method was its simplicity It was only necessaryto reclassifYthe land-use data into two classes onefor inhabitable areas and another for uninhabit-able areas This subjective decision was dependenton the land-use classification set as well as informa-tion known about the mapped area

The binary method was originally developedfor use as a mapping technique (Langford et a1990 1991 Langford and Unwin 1994) but even-tually was applied in areal interpolation problems(Fisher and Langford 1995 Cockings et al 1997)It represents a specialized form of the limiting vari-able method that McCleary (1969) described Allland uses except urban and agriculturalwoodlandin our implementation are limited to zero forpolygons deemed uninhabitable

The fundamental operations of the G1methodwere adapted to work with polygon (versus grid-ded) land-use data to create method Pl The poly-gon binary method also assigns 100 percent ofthe data in each county to only the urban andagriculturalwoodland areas within the county withpopulation or housing densities recalculated forthese new areas

Polygon and Grid Three-class Methods (P2 and G2)For the polygon three-class method (P2) we useda weighting scheme to assign population or hous-ing data to three different land-use classes withineach county urban agriculturalwoodland and for-ested For counties with all three land-use classespresent 70 percent of the data for each county wasassigned to urban land-use polygons 20 percent toagriculturalwoodland polygons and 10 percent toforested land Water received no data The percent-ages we selected were a subjective decision partlybased on a previous implementation by Hollowayet al (1996) who mapped population density forMissoula County Montana Working with censustracts as their choropleth units they assigned 80percent of each tract population to urban polygons10 percent to open polygons and 5 percent eachto agricultural and forested areas Our percentagebreakdown (70-20-10) was chosen because we esti-mated that a larger percentage of the mapped vari-able would be found in agriculturaVwoodland andforested areas in our study region than for theregion mapped by the Montana researchers For

129

this experiment adjustments in the weighting per-centages were not made for different mapped vari-ables the same scheme was used for all variables

A major weakness of the three-class method isthat it does not account for the area of each partic-ular land use within each county A worst-case sce-nario would result for a county that had only one ortwo small urban polygons These polygons wouldstill receive 70 percent of the data for the entirecounty causing the urban areas in that county tohave unusually high densities and the other land-use areas to have lower densities

Method G2 represents a grid-based implemen-tation of method P2 As with the polygon versionthe method used a weighting scheme to assign datato different land-use classes within each countyFor counties with all three inhabitable land uses70 percent of county data were assigned to urbangrid cells 20 percent to agriculturalwoodland cellsand 10 percent to forested cells The method suf-fers from the same weaknesses as the P2 methodsubjectivity of the weightings and potential prob-lems with very small areas of a particular land usein a county As with Gl a major strength of thegrid three-class method was its easier implementa-tion in Avenue

Limiting Variable Method (P3)The final polygon method is the pure form of thelimiting variable method described by McCleary(1969) Wright (1936) also made use of this con-cept as did Gerth (1993) and Charpentier (1997)in their GIS implementations The first step in thismethod was to assign data by simple areal weight-ing to all inhabitable polygons in each county Forthe total population variable data were assignedso that the three inhabitable land-use types withina county (urban agriculturalwoodland and for-ested polygons) had equal population densitiesAt this step water was limited to zero densityNext we set thresholds of maximum density forparticular land uses and applied these throughoutthe study area For example agriculturalwoodlandareas were limited to 50 people per square kilome-ter and forested areas were assigned a lower thresh-old of 15 people per square kilometer for the totalpopulation variable we mapped degThefinal step inthe mapping process was use of these thresholdvalues to make adjustments to the data distributionwithin each county If a polygon density exceededits threshold it was assigned the threshold densityand the remaining data were removed from thearea These data were distributed evenly to theremaining zones in the county (Gerth 1993)

To decide the upper limits on the densitiesof the mapped variable for agriculturalwoodland

130

and forested land uses we examined data availableat the county level Eighteen counties in the data-set were classed entirely as agriculturalwoodlandand we used these to set thresholds For each ofthe sixvariables the agriculturalwoodland thresh-old was based on the value for the sixth highestcounty (70th percentile) The forested thresholdwas chosen using the same list of counties andbased on the value of the fourth lowest (20th per-centile) countyThis approach resulted in differentthresholds for each variable-a systematic custom-ization necessitated by differing magnitude rangesamong variables Only one county in the studyarea was fully forested yielding insufficient infor-mation for the selection of the maximum densitythreshold for forested polygons

Calculation and Analysis of ErrorMeasuresA varietyof waysof measuring error have been usedin areal interpolation research We chose to followFisherand Langford (1995)in their use of root meansquared (RMS)error and a coefficientof variation todescribe errors in dasymetric zones Other research-ers have used mean percent error and mean abso-lute percent error to summarize error (Goodchildetal 1993)We chose RMS error because it could beapplied to count data (eg total number of people)and waseasily interpreted as a value with the sameunits as the mapped variable

We began by calculating the RMS error foreach county using errors for dasymetric zoneswithin the countyThis calculation had three basicsteps1 Calculate the square of the difference between

the correct and interpolated counts for eachdasymetric zone

2 Calculate the mean of these squared errorswithin a county and

3 Take the square root of the mean to producea summary county error measure the RMSerrorThe correct data values for dasymetric zones

(needed for step 1) were calculated by summingpopulation or housing units for block groupswithin each dasymetric zone The coefficient ofvariation for each county was calculated by divid-ing the RMSerror by the correct county data value(number of people or housing units in 1990)Thisstandardized value allowed for direct comparisonacross the six mapped variables

A repeated-measures analysis of variance(ANOVA)wasused to test for significant differencesbetween mean coefficients of variation among thefive mapping methods Coefficients of variation

Cmtography and GeograPhic Information Science

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 5: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

tion and density of black population was 070The correlations between total population and thefour housing variables ranged from 060 to 079while the correlations between black populationand the housing variables ranged from 027 to047 Among the four housing variables correla-tions ranged from 081 to 099

We selected data that were available for coun-ties and block groups-two nested scales of censusenumeration In the experiment dasymetric mapswere created using the county-level data for themapped variables Data available at the block-group level were then used to check the resultsof the dasymetric mapping methods Over 13000block groups were contained in the study area andwere used to calculate errors for 580 dasymetricmap zones

Dasymetric Mapping MethodsThree core methods from the dasymetric and arealinterpolation literature were tested Two methodswere based on dasymetric mapping research thepolygon three-class method and the traditionallimiting variable method The polygon three-classmethod was used to produce a series of maps in anacademic production cartography lab (Hollowayetal 1996) while the traditional limiting methodhas roots in the 65-year-old heavily cited paperby Wright (1936) The grid binary method whichuses raster land-use data was selected from theareal interpolation literature

The dasymetric methods were vector basedand the areal interpolation method was grid basedTo provide a balanced experiment we tested twoother methods that were adaptations of these previ-ously existing methods a grid three-class method(complementing the polygon three-class version)and a polygon binary method (complementing thegrid binary version) They were developed for thisresearch and have not been tested in the publishedliterature Wefelt it was necessary to test both gridand polygon versions of the methods to be ableto attribute differences in performance to differ-ences in the methods without worrying that thedifferences might be amibutable to data structuresThe limitingvariable method isa polygon dasymetricmethod whichwe did not adapt for grid processing

The following sub-sections describe the fivemethods we tested We pair related grid andpolygon methods in the descriptions listing themethod that originates from the literature beforethe derivative method (eg Gl before PI and P2before G2) Methods were implemented in ArcViewusing Avenue scripting language (see the Appen-dix in Eicher (1999) for scripts)

Vol28 No2

Grid and Polygon Binary Methods (GI and PI)For the grid binary method (Gl) 100 percentof the data in each county was assigned to onlythe urban and agriculturaVwoodland cells derivedfrom the raster land-use ancillary dataset No datawere assigned to the other classes (forested water)hence the label binary The primary advantage ofthis method was its simplicity It was only necessaryto reclassifYthe land-use data into two classes onefor inhabitable areas and another for uninhabit-able areas This subjective decision was dependenton the land-use classification set as well as informa-tion known about the mapped area

The binary method was originally developedfor use as a mapping technique (Langford et a1990 1991 Langford and Unwin 1994) but even-tually was applied in areal interpolation problems(Fisher and Langford 1995 Cockings et al 1997)It represents a specialized form of the limiting vari-able method that McCleary (1969) described Allland uses except urban and agriculturalwoodlandin our implementation are limited to zero forpolygons deemed uninhabitable

The fundamental operations of the G1methodwere adapted to work with polygon (versus grid-ded) land-use data to create method Pl The poly-gon binary method also assigns 100 percent ofthe data in each county to only the urban andagriculturalwoodland areas within the county withpopulation or housing densities recalculated forthese new areas

Polygon and Grid Three-class Methods (P2 and G2)For the polygon three-class method (P2) we useda weighting scheme to assign population or hous-ing data to three different land-use classes withineach county urban agriculturalwoodland and for-ested For counties with all three land-use classespresent 70 percent of the data for each county wasassigned to urban land-use polygons 20 percent toagriculturalwoodland polygons and 10 percent toforested land Water received no data The percent-ages we selected were a subjective decision partlybased on a previous implementation by Hollowayet al (1996) who mapped population density forMissoula County Montana Working with censustracts as their choropleth units they assigned 80percent of each tract population to urban polygons10 percent to open polygons and 5 percent eachto agricultural and forested areas Our percentagebreakdown (70-20-10) was chosen because we esti-mated that a larger percentage of the mapped vari-able would be found in agriculturaVwoodland andforested areas in our study region than for theregion mapped by the Montana researchers For

129

this experiment adjustments in the weighting per-centages were not made for different mapped vari-ables the same scheme was used for all variables

A major weakness of the three-class method isthat it does not account for the area of each partic-ular land use within each county A worst-case sce-nario would result for a county that had only one ortwo small urban polygons These polygons wouldstill receive 70 percent of the data for the entirecounty causing the urban areas in that county tohave unusually high densities and the other land-use areas to have lower densities

Method G2 represents a grid-based implemen-tation of method P2 As with the polygon versionthe method used a weighting scheme to assign datato different land-use classes within each countyFor counties with all three inhabitable land uses70 percent of county data were assigned to urbangrid cells 20 percent to agriculturalwoodland cellsand 10 percent to forested cells The method suf-fers from the same weaknesses as the P2 methodsubjectivity of the weightings and potential prob-lems with very small areas of a particular land usein a county As with Gl a major strength of thegrid three-class method was its easier implementa-tion in Avenue

Limiting Variable Method (P3)The final polygon method is the pure form of thelimiting variable method described by McCleary(1969) Wright (1936) also made use of this con-cept as did Gerth (1993) and Charpentier (1997)in their GIS implementations The first step in thismethod was to assign data by simple areal weight-ing to all inhabitable polygons in each county Forthe total population variable data were assignedso that the three inhabitable land-use types withina county (urban agriculturalwoodland and for-ested polygons) had equal population densitiesAt this step water was limited to zero densityNext we set thresholds of maximum density forparticular land uses and applied these throughoutthe study area For example agriculturalwoodlandareas were limited to 50 people per square kilome-ter and forested areas were assigned a lower thresh-old of 15 people per square kilometer for the totalpopulation variable we mapped degThefinal step inthe mapping process was use of these thresholdvalues to make adjustments to the data distributionwithin each county If a polygon density exceededits threshold it was assigned the threshold densityand the remaining data were removed from thearea These data were distributed evenly to theremaining zones in the county (Gerth 1993)

To decide the upper limits on the densitiesof the mapped variable for agriculturalwoodland

130

and forested land uses we examined data availableat the county level Eighteen counties in the data-set were classed entirely as agriculturalwoodlandand we used these to set thresholds For each ofthe sixvariables the agriculturalwoodland thresh-old was based on the value for the sixth highestcounty (70th percentile) The forested thresholdwas chosen using the same list of counties andbased on the value of the fourth lowest (20th per-centile) countyThis approach resulted in differentthresholds for each variable-a systematic custom-ization necessitated by differing magnitude rangesamong variables Only one county in the studyarea was fully forested yielding insufficient infor-mation for the selection of the maximum densitythreshold for forested polygons

Calculation and Analysis of ErrorMeasuresA varietyof waysof measuring error have been usedin areal interpolation research We chose to followFisherand Langford (1995)in their use of root meansquared (RMS)error and a coefficientof variation todescribe errors in dasymetric zones Other research-ers have used mean percent error and mean abso-lute percent error to summarize error (Goodchildetal 1993)We chose RMS error because it could beapplied to count data (eg total number of people)and waseasily interpreted as a value with the sameunits as the mapped variable

We began by calculating the RMS error foreach county using errors for dasymetric zoneswithin the countyThis calculation had three basicsteps1 Calculate the square of the difference between

the correct and interpolated counts for eachdasymetric zone

2 Calculate the mean of these squared errorswithin a county and

3 Take the square root of the mean to producea summary county error measure the RMSerrorThe correct data values for dasymetric zones

(needed for step 1) were calculated by summingpopulation or housing units for block groupswithin each dasymetric zone The coefficient ofvariation for each county was calculated by divid-ing the RMSerror by the correct county data value(number of people or housing units in 1990)Thisstandardized value allowed for direct comparisonacross the six mapped variables

A repeated-measures analysis of variance(ANOVA)wasused to test for significant differencesbetween mean coefficients of variation among thefive mapping methods Coefficients of variation

Cmtography and GeograPhic Information Science

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 6: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

this experiment adjustments in the weighting per-centages were not made for different mapped vari-ables the same scheme was used for all variables

A major weakness of the three-class method isthat it does not account for the area of each partic-ular land use within each county A worst-case sce-nario would result for a county that had only one ortwo small urban polygons These polygons wouldstill receive 70 percent of the data for the entirecounty causing the urban areas in that county tohave unusually high densities and the other land-use areas to have lower densities

Method G2 represents a grid-based implemen-tation of method P2 As with the polygon versionthe method used a weighting scheme to assign datato different land-use classes within each countyFor counties with all three inhabitable land uses70 percent of county data were assigned to urbangrid cells 20 percent to agriculturalwoodland cellsand 10 percent to forested cells The method suf-fers from the same weaknesses as the P2 methodsubjectivity of the weightings and potential prob-lems with very small areas of a particular land usein a county As with Gl a major strength of thegrid three-class method was its easier implementa-tion in Avenue

Limiting Variable Method (P3)The final polygon method is the pure form of thelimiting variable method described by McCleary(1969) Wright (1936) also made use of this con-cept as did Gerth (1993) and Charpentier (1997)in their GIS implementations The first step in thismethod was to assign data by simple areal weight-ing to all inhabitable polygons in each county Forthe total population variable data were assignedso that the three inhabitable land-use types withina county (urban agriculturalwoodland and for-ested polygons) had equal population densitiesAt this step water was limited to zero densityNext we set thresholds of maximum density forparticular land uses and applied these throughoutthe study area For example agriculturalwoodlandareas were limited to 50 people per square kilome-ter and forested areas were assigned a lower thresh-old of 15 people per square kilometer for the totalpopulation variable we mapped degThefinal step inthe mapping process was use of these thresholdvalues to make adjustments to the data distributionwithin each county If a polygon density exceededits threshold it was assigned the threshold densityand the remaining data were removed from thearea These data were distributed evenly to theremaining zones in the county (Gerth 1993)

To decide the upper limits on the densitiesof the mapped variable for agriculturalwoodland

130

and forested land uses we examined data availableat the county level Eighteen counties in the data-set were classed entirely as agriculturalwoodlandand we used these to set thresholds For each ofthe sixvariables the agriculturalwoodland thresh-old was based on the value for the sixth highestcounty (70th percentile) The forested thresholdwas chosen using the same list of counties andbased on the value of the fourth lowest (20th per-centile) countyThis approach resulted in differentthresholds for each variable-a systematic custom-ization necessitated by differing magnitude rangesamong variables Only one county in the studyarea was fully forested yielding insufficient infor-mation for the selection of the maximum densitythreshold for forested polygons

Calculation and Analysis of ErrorMeasuresA varietyof waysof measuring error have been usedin areal interpolation research We chose to followFisherand Langford (1995)in their use of root meansquared (RMS)error and a coefficientof variation todescribe errors in dasymetric zones Other research-ers have used mean percent error and mean abso-lute percent error to summarize error (Goodchildetal 1993)We chose RMS error because it could beapplied to count data (eg total number of people)and waseasily interpreted as a value with the sameunits as the mapped variable

We began by calculating the RMS error foreach county using errors for dasymetric zoneswithin the countyThis calculation had three basicsteps1 Calculate the square of the difference between

the correct and interpolated counts for eachdasymetric zone

2 Calculate the mean of these squared errorswithin a county and

3 Take the square root of the mean to producea summary county error measure the RMSerrorThe correct data values for dasymetric zones

(needed for step 1) were calculated by summingpopulation or housing units for block groupswithin each dasymetric zone The coefficient ofvariation for each county was calculated by divid-ing the RMSerror by the correct county data value(number of people or housing units in 1990)Thisstandardized value allowed for direct comparisonacross the six mapped variables

A repeated-measures analysis of variance(ANOVA)wasused to test for significant differencesbetween mean coefficients of variation among thefive mapping methods Coefficients of variation

Cmtography and GeograPhic Information Science

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 7: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

Land use

D Agricultural

Woodland

Forested

bull Urban

Figure 3 Aggregated land-use data for study area

were the repeated measures compared within coun-ties (159 counties were the subjects in the within-subjects parlance of repeated-measures AiTOVA)

Results and Discussion

Statistical Analysis of ErrorsTable I lists means of coefficients of variation for159 counties for each of 30 maps examined in theanalysis Dasymetric methods are ranked by theright-most column that lists overall means for therows P3 had the smallest mean coefficient of varia-tion followed by P2 PI G2 and G1 This rankorder is consistent between variables with a fewtied coefficients and only a single pair of methodschanging places within anyone column For exam-ple G2 is ranked before PI in the first two housingvariables P3 is ranked first for all of the variables

Vol28 No2

This sample of variables contains more housingvariables than population variables but the con-sistency in ran kings of the dasymetric methodswithin variables indicates that this imbalance inthe sample does not bias the overall results

The ANOVA showed that the dasymetricmethod had a significant effect overall (p = 0000F =] 1164 df = 245]) based on the Greenhouse-Geiser statistic (SPSS Inc 1999) Thus the nullhypothesis that there is no difference in the levelof error among the methods was rejected Table2 lists the mean coefficients of variation for eachmethod (from the right column of Table I) andshows confidence limits calculated around thosemeans Significant differences between pairs ofmean coefficients of variation were tested usingthe Bonferroni method because it accounts formultiple comparisons and has been shown to beuseful for large sample sizes (SPSS Inc) The pair-wise results (Table 3) show that P3 the limitingvariable method (which had the lowest error for allvariables) was significantly different from all othermethods Thus the limiting variable method (P3)performed well in comparison to the other meth-ods

Although no other pairings of methods showedsignificant differences the three-class methodsperformed slightly better than the correspondingbinary methods (P2 was better than PI and G2 wasbetter than Gl) Though this difference was notsignificant in our work (Table 3) the result was notconsistent with previous research that showed sim-pler methods performing better than more com-plex versions (see Eicher 1999 p 102)

For a straightforward comparison of errors weadapted GI and P2 to the alternative data struc-ture (G1 to polygon-based PI and P2 to grid-basedG2) This redundancy was important because thegrid methods produced generally higher errorsthat might have confounded comparison (eg G2had lower errors than GI but higher error thanPI) This approach also allowed us to examinewhether there was a significant difference betweenmethods that use polygon land-use data and thosemethods that use grid land-use data Methods PIand GI and methods P2 and G2 were directly com-parable PI had lower error than Gland P2 hadlower error than G2 though in both cases the dif-ferences were not significant (Table 3)

In sum the results of the repeated-measuresANOVA supported three conclusions The tradi-tional limiting variable method (P3) produceddasymetric maps with significantly lower meanerror than maps produced using all other methodsBased on pairwise comparison of method PI to P2

131

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 8: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

House ValuesOverall mean ofDasymetric Total Black

Method Population Population $40000middot $50000middot $60000middot CY for method

Ilt$25000 49000 59000 75000

IP3 047 064 054 049 045 049 051

(limiting)P2 062 070 065 064 063 064 065

(3-class) -Pl 062 075 071 067 064 064 067

lbinary)G2 064 080 067 066 065 066 068(3-c1ass) -Gl 063 090 072 068 065 080 073(binary)

Overall mean of 060 076 066 063 061 065 065cv tor variableNotes Entries in the center portion of the table are means of 159 county coefficients of variation (cv) for 30 individual maps Values inthe right-most column are overall means of cv of each dasymetric method and values in the bottom row are overall means of cv ofeach variable

Table 1 Summary of county-level coefficients of variation of RMS error measures

Method Mean CY 95 Confidence IntervalP3 (limiting) 051 043 - 059P2 (3-c1ass) 065 055 - 074Pl (binary) 067 058 - 077

G2 (3-c1ass) 068 058 - 079Gl (binary) 073 064 - 082

tionships error maps will be presented allowingvisual analysis of spatial error patterns Presenta-tion of this type of data in tandem with the dasy-metric maps may generate hypotheses and ideasfor future research

Example Map SeriesFigures45 and 6 present (a) dasymetric maps (b)error maps based on percent error and (c) errormaps based on count error for each of the three

P3P3 P2

PlP2 0001 G2Pl 0000 1000G2 0000 0313 0059Gl 0000 0609 1000 1000

sig nificant differences

Table 3 Probabilities for differences in errors for pairs of dasymetricmethods

and method G1 to G2 thethree-class methods werefound to perform slightlybetter than the binary meth-ods though the overall dif-ference was not significantBased on pairwise compar-ison of method PI to G1 cv = coefficient of variation of RMS county errors

and method P2 to G2 poly- Table 2 Estimated mean errors and confidence intervals for dasymetric methodsgon methods were found toproduce slightly more accu-rate maps than grid methods thoughthe difference was not significant

Many of the papers we reviewed for thiswork especially research on areal inter-polation lacked a visual presentation ofresults Papers with maps (eg Langfordet al 1990 Fisher and Langford 1996Charpentier 1997) typically focused onsmall study areas We now present dasymetricmaps resulting from the three polygon methodstested in this experiment (see also Eicher (1999)for corresponding grid-based maps)

Some of the work in areal interpolation soughtto describe the types of error produced (Cockingset al 1997 Langford et al 1991) For examplethese studies formally tested the connections ofgeometry and attribute characteristics to map errorWhile this section will not formally test these rela-

Visual Analysis of DasymetricMaps and Errors

132 CartograPhy and GeograPhic Information Science

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 9: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

polygon mapping methods The dasymetric mapsare density representations of the total populationvariable Boundaries for all dasymetric zones areshown on these maps Percent error maps showerror relative to the total population of a polygonOne possible weakness of percent error maps isthat rural areas generally will have higher percenterrors because of lower totals within these zonesFurthermore when presented in map form thesehigher errors dominate because rural map zonestend to be larger than urban map zones

Count errors are presented to provide a viewof the absolute error present across the study area(map C in Figures 4 5 and 6) Maps of count errorwill not signal high errors in rural areas with lowpopulations as seen on the percent error mapsWhen count errors are mapped the reader getsmore of an impression of the overall quantity ofthe variable that has been mapped inaccurately Amajor weakness of presenting count data with arealfills however is that larger polygons will tend tohave higher values than smaller polygons with sim-ilar characteristics Despite this recognized prob-lem with mapping counts consistent symbolizationwas maintained to aid comparison with the othertwo map types (a and b) within the figures

Data ranges for classifications are sharedamong corresponding maps in Figures 4 5 and 6to aid comparison The extreme classeshave variedmaximums because the calculated maximum pop-ulation density varied widely depending on themapping method Note that adjacent classes in leg-ends share data breaks namely the upper valuefor the first class (25 personskm2) is the same asthe lower value for the second class the uppervalue for the second class (75 personskm2) is thesame as the lower value for the third class etcEach class should be interpreted as encompassingall values up to but not including the highest valuelisted in the legend for the class This method ofdescribing class breaks allows precise definition ofbreaks without the use of excessiveprecision in sig-nificant digits that clutter a legend and make mapuse difficult (Monmonier 1982)

Binary Method (Pl)The dasymetric map in Figure 4a clearly shows thebinary distinction made in this method betweeninhabitable and uninhabitable areas All areas des-ignated as forested (a large portion of the map)have been assigned zero population The distinc-tion is most obvious in the ridge-and-valley phys-iographic province running diagonally across thestudy region where ridges appear as white (for-ested areas that are assigned no population) and

11Jl 28 No2

valleys are yellow-greens (populated urban andagriculturalwoodland areas)

The percent error map (Figure 4b) as expectedshows under-prediction (grays) for the forestedareas although urban areas have also been under-predicted The latter occurs because urban areasare not distinguished from agriculturalwoodlandareas by the method receiving the same populationdensity when they are in the same county Manyagriculturalwoodland areas are over-predicted forthe same reason especially in areas boundingmajor metropolitan areas (ashington BaltimorePittsburgh) The count error map (Figure 4c) showsslightly different patterns of error Most interest-ing are the relatively low errors in rural forestedareas Though all these areas have -100 percenterror (because they were assigned zero popula-tion) counts are comparatively low because popu-lations are low

Three-Class Method (P2)The dasymetric map in Figure 5a shows that nearlyall areas received population with the exception ofwater The map also distinguishes between urbanand agriculturalwoodland and thus urban placeslike York Erie and Harrisburg (see Figure 1 forlocations) have visiblyhigher populations than thesurrounding rural areas

The percent error map (Figure 5b) shows thaturban areas are either predicted correctly or under-predicted (eg Pittsburgh Washington Baltimore)This map also shows that forested areas are under-predicted especially in the ridge-and-valley prov-ince Conversely the valleys typically classifiedas agriculturalwoodland are over-predicted Theunder-prediction of forested areas in most countiessuggests that not enough population was assignedto the forested land use in the 70-20-10 percentweighting scheme Again the map of count error(Figure 5c) shows lower errors throughout theridge-and-valley province because populations arelower in the mostly rural area

Limiting Variable Method (P3)The dasymetric map produced using the P3method (Figure 6a) which had the lowest overallerror closely resembles the result from method P2which had the next best error measures Slight dif-ferences in dasymetric map patterns for the twomethods are noticeable in northwest Pennsylvaniaalong the top edge of the study area as well as incentral Virginia along the southern edge of themap The limiting variable method is also similarto method P2 in that few areas of the map havezero population

133

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 10: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

a Population Density b Percent Error c Count Error(persons per km2)

050 to 37233 20000 to 38852

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25 to -10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 13551 -100 to -50 -395877 to -20000

Figure 4 Maps for binary method (Pl) a dasymetric population density map b percent error and c counterror in number of persons

The pattern of error for the P3 method (Fig-ures 6b and 6c) is different however from thefirst two methods An interesting radial pattern isevident around both Washington and BaltimoreThese areas are unique because they have large met-ropolitan populations and the municipal boundar-ies of each designate them as independent citiesessentially small counties in this analysis so thesecentral cities are well predicted Immediately out-ward however are over-predicted areas followedby a ring of under-predicted areas at the fringesof the metropolitan areas The boundary betweenover-prediction and under-prediction is the func-tional border between urban and agriculturalwoodland land use The Pittsburgh area has asimilar pattern with a central urban area that iswell predicted surrounded by areas of agriculturalwoodland land use that are under-predicted

134

For major metropolitan areas we concludethat the threshold value for agriculturalwoodlandareas was set too low As a result in the case ofWashington the outermost suburbs (classified asagriculturalwoodland) received the threshold den-sity of 15 persons per km2 while too much datawas then carried over and assigned to more trulysuburban areas (classified as urban) closer to thecity center In the case of Pittsburgh a smallercity suburbs immediately proximate to the citywere classified as agriculturalwoodland assignedthe corresponding agriculturalwoodland densitythreshold and therefore under-predicted The ageof the land-use data could also be a factor inboth cases because areas classified as agriculturalwoodland may have been developed between 1970and 1990

CartograPhy and GeograPhic Information Science

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 11: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

a Population Density b Percent Error c Count Error(persons per km2)

050 to 127028 20000 to 26418

25 to 50 10000 to 20000o to 25

10 to 25 5000 to 1000025 to 75

-10 to 10 -5000 to 5000

75 to 175-25to-10 -10000 to -5000

175 to 375 -50 to -25 -20000 to -10000

375 to 16525064 -100 to -50 -288114 to -20000

Figure 5 Maps for three-class method (P2) a dasymetric population density map b percent error and ccount error in number of persons

Conclusions and Future Research

Research Question A Mapping MethodsThere were significant differences in accuracy forthe five dasymetric mapping methods tested inthis experiment The limiting variable method(P3) produced maps with significantly lower errorthan maps made using the other methods Three-class methods (P2 and G2) produced maps witherrors that were not significantly different thanthe errors of maps produced with binary methods(PI and GI)

Though the traditional limiting variablemethod (P3) is over a half-century old (Wright1936) we have shown that it can be implementedin a GIS to produce accurate dasymetric mapsThe success of this method may in part be due

11il28 No2

to its customized approach Threshold values usedto shift data between zones were based on thedata distribution for each mapped variable Thisapproach contrasts with three-class methods P2and G2 in which the same 70-20-10 percentageweightings were applied to all variables Futurework could examine the effect of using differentthreshold values for the land-use classesAlso impor-tant are methods of determining the P3 thresholdvalues Perhaps the reassignment schemes used forP3 could also be made less arbitrary and morespecific to the geography of each source zone bymaking the thresholds dependent on the land-useareas within each source zone (Mennis 1999) Forexample a threshold of 20 people per square kilo-meter used for agriculturaVwoodland areas couldbe increased to 30 in counties with more than 50

135

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 12: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

percent urban land uses In the future a standard-ized and generalizable decision process with a sta-tistical basis might be developed for this purpose

Research Question B Polygon Methodsversus Grid MethodsBased on a pairwise comparison of polygon andgrid versions of the binary method (PI and GI)and the three-class method (P2 and G2) errors forpolygon dasymetric maps were not significantlydifferent than for grid maps In our implementa-tion the use of grid methods was more convenientWithin the ArcView environment grid (raster) ver-sions of our datasets took up less storage spaceand the grid methods were easier to implementusing Avenue scripting Processing was also fasterin ArcViewbecause shorter scripts were typical ofthe grid methods (Eicher 1999) Grid methodswere developed in the areal interpolation litera-ture but have not been popular for dasymetricmapping Their effectiveness in producing dasy-metric maps in this experiment as well as theirease of implementation suggests that they shouldbe used more frequently In contrast advantagesof the equally accurate polygon methods includegreater flexibility for publication-quality mappingsuch as ease of adjusting line weights to suit differ-ent presentation sizes

Future work could study the effect of grid cellsize on the accuracy of dasymetric maps The rela-tionship between cell size and map scale is espe-cially interesting A research project focusing ongrid dasymetric maps could determine optimalgrid cell sizes for varied map scales For examplewe expect that use of excessively small grid cellsfor a given scale would lead to higher map errorSuch research would also provide an analyticalmethod of balancing increases in accuracy affordedby smaller grid cells with the increased storagerequirements and processing times associated withlarger data files See Eicher (1999) for further dis-cussion of this topic

Research Question C Dasymetric Mapsand Error MapsVisual inspection of dasymetric maps and error mapssupported the results from the statisticalanalysispor-tion of the experiment and offered additional insightinto the usefulness of dasymetric maps It is our opin-ion that the resulting dasymetric maps are visuallyeffective showing more accurate distributions thancounty aggregations provide

Classification (the data ranges represented bymap colors) for dasymetric maps was not closely

136

examined in this work Future work could investi-gate whether classifications that have been shownto work well for choropleth maps also work wellfor dasymetric maps An interesting possibility isthe use of contiguity-based classifications (Cromley1996) which might make dasymetric maps visually

more dasymetric Contiguity-based classificationsconsider differences across boundaries betweenadjacent polygons and this strategy is particularlywell suited for application to boundaries on dasy-metric maps Dasymetric boundaries are moreclosely related to the underlying data distributionthan boundaries on choropleth maps because dasy-metric zones are more internally homogeneous

Many possibilities exist for extending calcula-tion and comparison of error in dasymetric mapsFuture work could include a number of between-county factors or covariates in the repeated-mea-sures ANOVAThese could test the relationshipbetween map error and land-use areas popula-tion densitygeographic location and other factorsAlso different approaches could be taken to evalu-ating the quality of dasymetric maps in an exper-iment A defining property of dasymetric mapsis homogeneity within map zones however thisproperty was not formally tested in this researchIt may be possible to compute the variance foreach map zone using values from the block groupscomprising that zone Variances within each mapzone could then be compared to variances betweenmap zones to establish a measure of dasymetricity(homogeneity within zones combined with distinct-ness of differences between adjacent zones) To fur-ther examine dasymetric mapping methods andthe properties of dasymetric maps a number ofdifferent map zone configurations could be usedFor example maps resulting from different ancil-lary variables such as land use soil types or slopezones could be compared Likewisemaps resultingfrom different source zones such as counties votingdistricts and economic planning areas could becompared

SummaryWith a structured experiment we determinedthat the traditional limiting variable method wasthe most accurate of those tested and that griddasymetric mapping methods produced maps ofsimilar accuracy to their polygonal counterpartsWrights limiting variable concept developed 65years ago has maintained its utility This tradi-tional method has been updated and adapted suc-cessfully to produce dasymetric maps in a GISmapping environment Additionally the availabil-

CartogmPhy and Geogmphic Information Science

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 13: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

ity of GIS technology has made data processingfor dasymetric map production a more pragmaticprocess Work (such as this paper) on dasymetricmapping research on areal interpolation andincreased use of GIS by cartographers are firststeps toward producing a formal set of dasymetricmapping methods that updates George McClearys32-year-old typology GIS software packages mayeventually offer full dasymetric mapping function-ality or at least provide the means to make thesetypes of maps more easily

ACKNOWLEDGEMENTSSpecial thanks must be given to Jeremy Mennisand Donna Peuquet for their comments and con-structive criticism on this research We appreciateTerry Slocums careful edits of the manuscript

REFERENCESAnderson J R 1967 Major land uses in the United

States In US Geological SurveyNational atlas of theUnited States of America U S Geological SurveyWash-ington DC pp 158-9

Bloom L M PJ Pedler and G E Wragg 1996 Imple-mentation of enhanced areal interpolation using Map-info Computers amp Geosciences 22(5) 459-66

Charpentier M 1997 The dasymetric method with com-ments on its use in GIS In 1997 UCGISAnnualAssem-bly amp Summer Retreat The University Consortium forGeographic Information Science June 15-20 BarHarbor Maine

Chrisman N R 1997 Exploring geograPhic informationsystems New York New YorkJohn Wileyamp Sons

Chrisman N R 1998 Rethinking levels of measure-ment for cartography Cartography and GeograPhicInformation Systems 25(4) 231-42

Cockings S P Fisher and M Langford 1997 Param-eterization and visualization of the errors in arealinterpolation GeograPhical Analysis 29(4) 314-328

Cromley R G 1996 A comparison of optimal classifica-tion strategies for choroplethic displays of spatiallyaggregated data International journal of GeograPhicalInformation Systems 10(4) 405-24

DeMers M N 1997 Fundamentals of geograPhic informa-tion systems New York New YorkJohn Wileyamp Sons

Eicher C L 1999 Implementation and evaluation ofdasymetric mapping methods MS Thesis Pennsyl-vania State University

Fisher Pand M Langford 1995 Modeling the errors inareal interpolation between zonal systems by MonteCarlo simulation Environment and Planning A 27211-24

Fisher Pand M Langford 1996 Modeling sensitivity toaccuracy in classified imagery A study of areal inter-polation by dasymetric mapping Professional Geogra-pher 48(3) 299-309

Flowerdew R and M Green 1989 Statistical methodsfor inference between incompatible zonal systemsIn M Goodchild and S Gopal (eds) Accuracy of spa-

Vol 28 No2

tial databases London UK Taylor and Francis pp239-47

Flowerdew R and M Green 1991 Data integrationStatistical methods for transferring data betweenzonal systems In l Masser and M B Blakemore(eds) Handling geographical information Essex UKLongman Scientific amp Technical pp 38-53

Flowerdew R and M Green 1992 Developmentsin areal interpolation methods and GIS Annals ofRegional Science 26 67 -78

Flowerdew R and M Green 1994 Areal interpolationand types of data In S Fotheringham and P Rog-erson (eds) Spatial analysis and GIS London uKTaylor and Francis pp 121-45

Flowerdew R M Green and E Kehris 1991 Usingareal interpolation methods in GIS Papers in RegionalScience The journal of the RSAI 70(3) 303-15

Gerth J 1993 Towards improved spatial analysis withareal units The use of GIS to facilitate the creation ofdasymetric maps Masters Paper Ohio State Univer-sity

Goodchild M F and N S Lam 1980 Areal interpola-tion A variant of the traditional spatial problem Geo-processing I 297-312

Goodchild M F L Anselin and U Deichmann 1993A framework for the areal interpolation of socioeco-nomic data Environment and Planning A 25 383-97

Holloway S R J Schumacher and R Redmond 1996People and place Dasymetric mapping using Arc)Info Missoula Wildlife Spatial Analysis Lab Univer-sity of Montana

Hitt K J 1991 Digital map file of major land uses inthe United States US Geological Survey AccessedOctober 1998 ArcInfo coverage available [romhttpwa ter usgs govG ISmetadatausgswrdllandusehtml

Langford M D J Maguire and D J Unwin 1990Mapping the density of population Continuous sur-face representations as an alternative to choroplethand dasymetric maps Leicester UK MidlandsRegional Research Laboratory University of Leicesterand Loughborough University of Technology UK

Langford M D J Maguire and D J Unwin 1991The Areal interpolation problem Estimating popula-tion using remote sensing in a GIS framework Inl Masser and M B Blakemore (eds) Handling geo-graPhic information Essex UK Longman Scientificamp Technical pp 55-77

Langford M and D J Unwin 1994 Generating andmapping population density surfaces within a geo-graphical information system The Cartographic jour-nal31 21-6

MacEachren A M 1994 Some truth with maps A primeron symbolimtion and design Washington DC Associa-tion of American Geographers

McCleary G FJr 1969 The dasymetric method in the-matic cartography PhD Dissertation University ofWisconsin at Madison

McCleary G FJr 1984 Cartography geography andthe dasymetric method In Proceeding 12th Conferenceof International CartograPhic Association August 6-13Perth Australia 1599-610

137

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science

Page 14: 50;93. *-88361 -6/ 90-4 (6;09874-;376! (5840506;-;376 … lit...%-:?50;93. *-88361 -6/ "90-4 (6;09874-;376! (5840506;-;376 -6/ &=-4

c Count Error

20000 to 47335

10000 to 20000

5000 to 10000

-5000 to 5000

-10000 to -5000

-20000 to -10000

-382151 to -20000

Figure 6 Maps for traditional limiting variable method (P3) a dasymetric population density map bpercent error and c count error in number of persons

Mennis J 1999 Using GIS spatial statistics and visual-ization to investigate environmental justice Paper pre-sented at the Joint Statistical Meetings American Statisti-cal Association Baltimore Md August 8-12 1999

Monmonier M 1982 Flat laxity optimization androunding in the selection of class intervals Cartograph-ica 19(1) 16-27

Monmonier M and G A Schnell 1984 Land use andland cover data and the mapping of population den-sity International Yearbook of CartograPhy 24 24-9

Moxey A and P Allanson 1994 Areal interpolation ofspatially extensive variables A comparison of alterna-tive techniques International journal of GeograPhicalInformation Systems 8(5) 479-87

138

Robinson A J Morrison P Muehrcke A Kimerlingand S Guptill 1995 Elements of cartograPhy 6th edNew York New York John Wiley amp Sons

Slocum T A 1999 Thematic cartography and visualimtionUpper Saddle River New Jersey Prentice Hall

SPSS Inc 1999 SPSS help venion 90 SPSS Incorpo-rated Chicago Illinois

US Geological Survey 1970National atlas of the United Statesof Ame1Ua us Geological Surveyashington DC

US Geological Survey 1993 National water summary1990-91 Hydrologic events and stream water qualityUS Geological Survey water supply paper 2400 USGeological Survey Washington DC

Wright I K 1936 A method of mapping densities ofpopulation The GeograPhical Review pp 103-10

CartograPhy and Geographic Information Science