Small basin modeling of snow water equivalence using binary regression tree methods

Biogeochemistry of Seasonally Snow-Covered Catchments (Proceedings of a Boulder Symposium, July 1995). IAHS Publ. no. 228, 1995.

K. ELDER & J. MICHAELSEN Institute for Computational Earth System Science, University of California, Santa Barbara, California 93106, USA

J. DOZIER Institute for Computational Earth System Science and School of Environmental Science and Management, University of California, Santa Barbara, California 93106, USA

Abstract Binary regression tree methods were used to classify field data of snow water equivalence (SWE) into areas of similar accumulation based on physical parameters and solar radiation. All parameters and field measurements were registered to a 5-m resolution digital elevation model (DEM). Decision rules for the tree-based classifier were constructed from coregistered parameters of locations in the DEM containing field measurements. All grid cells within the basin were assigned SWE values based on the decision tree and the input parameters from each pixel. Tree-based methods are attractive for modeling SWE distribution controlled by a variety of processes because they allow the use of hierarchical data including: interval scale data (e.g. azimuth), nominal data (e.g. vegetation type), as well as ratio scale data (e.g. net radiation, elevation, and slope). Tree-based models explain a greater portion of the variance seen in the field data than any of the other methods explored by the authors. Analyses were carried out on data collected from the Emerald Lake basin located in Sequoia National Park, California, USA, during the 1987 water year.

INTRODUCTION

Water resources in the western United States are gaining attention as both our perception and reality point toward future shortages. Persons and organizations interested in agricultural, hydropower, municipal, and recreational water use are now showing keen interest in every drop flowing down western rivers. In many cases the rivers are over-allocated and the demand exceeds the supply. However, few studies have addressed the question of measurement and estimation of water stored in the mountains in the form of snow. This omission seems particularly odd given the large percentage of all water supplies first deposited in mid-latitude regions in the winter snowpack. Before we can accurately estimate how much water will flow out of mountain regions and when it will come, we need substantial improvements in our snowmelt modeling capabilities. Before we can improve these modeling capabilities, we must improve our ability to measure and characterize the spatial distribution of snow in the mountains.

BACKGROUND

Attempts have been made to model the spatial distribution of snow water equivalence (SWE) in a number of environments (Steppuhn & Dyck, 1974; Granberg, 1979; Weir, 1979; Hosang & Dettwiler, 1991), but few of these studies involved alpine regions. In the Emerald Lake basin, a small Sierra Nevada alpine watershed, Elder et al. (1991) used topographic parameters and solar radiation to explain the distribution of SWE as measured in the field. The modeling was statistical in nature and used Bayesian maximum-likelihood unsupervised classification to partition the basin into areas of similar SWE. Although that work was partially successful and represented an improvement in our modeling capabilities, the results failed to explain a large percentage of the observed variance in the field measurements. This failure came from two sources: (1) we have an incomplete understanding of the physical processes controlling the distribution of snow, which explains the lack of adequate deterministic models and our adoption of a statistical approach, and (2) the model is linear, although we are trying to describe fundamentally nonlinear processes. Although snow accumulation shows a linear relationship with some controlling factors (e.g. radiation), it is clearly nonlinear with respect to others (e.g. elevation).

BINARY REGRESSION TREES

The study reported in this paper uses the same data set as in Elder et al. (1991), but we have adopted a statistical technique that is capable of relating independent variables to a response variable in a nonlinear or hierarchical manner. The result is a greatly improved fit between the field measurements and the modeled estimates.

The relatively new statistical technique of regression tree (binary decision tree) modeling was used in the present study. The technique has gained popularity over the last decade for a number of applications (e.g. Michaelsen et al., 1987, 1994). Binary decision trees or predictive regression trees of the type used in this study estimate values for a response variable, y, based on the measurements of a set of predictor variables x (x_m, m = 1, 2, ...) from a measurement space X. The tree is constructed by repeated partitioning of subsets of X into two descendent subsets or nodes, where X itself is the root node and the partitions end in a set of terminal nodes. Each terminal node is assigned a value for y by calculating the average of the observations of y that fall in that node. The partition or split at each node is made on the values in y conditionally on values in the sample vector x, based on a single variable in x. For ordinal or ratio scale data, splitting decisions are posed in the form: is x_m < c?, where c is within the domain of x_m. For categorical variables, the decisions may be expressed as: is x_m ∈ S?, where S includes all possible combinations of subsets of the categories defined in x_m. At each stage the split is chosen which most reduces the combined mean squared error of the two descendent nodes relative to that of the parent node.
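The split selection described above can be sketched for a single ordinal predictor as follows. This is a minimal illustration, not the S-Plus implementation used in the study: the function names `sse` and `best_split` are hypothetical, candidate thresholds are assumed to lie midway between consecutive distinct values, the summed squared error of the two child nodes is used as the error measure, and the elevation and SWE values are invented.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold c, combined child SSE) for the best split 'x < c'.

    Candidate thresholds are midpoints between consecutive distinct x values
    (an assumption of this sketch). Returns (None, parent SSE) if no split
    lowers the error.
    """
    best_c, best_err = None, sse(y)
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        c = (lo + hi) / 2.0
        left = [yi for xi, yi in zip(x, y) if xi < c]
        right = [yi for xi, yi in zip(x, y) if xi >= c]
        err = sse(left) + sse(right)
        if err < best_err:
            best_c, best_err = c, err
    return best_c, best_err


# Hypothetical elevations (m) and SWE values (m); not data from the survey.
elev = [2850.0, 2900.0, 3000.0, 3200.0, 3300.0, 3400.0]
swe = [0.30, 0.35, 0.32, 0.90, 0.95, 1.00]
threshold, err = best_split(elev, swe)
```

With these invented values the chosen threshold falls between the shallow low-elevation points and the deep high-elevation points, so the two child nodes are each far more homogeneous than the parent.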

In the present study these decisions take the form: is elevation < 3175 m or is net radiation < 98.5 W m⁻²? The categorical analog is: does the vegetation cover of x_m belong to the subset woody? A portion of the final decision set for a node in a binary regression tree may look like the following:

if (ELEV_ij > 3023 m) and (SLOPE_ij < 35°) and (VEG_ij = 2) then the predicted value of SWE_ij is 2.98 m,

where ELEV, SLOPE, and VEG are the independent variables elevation, slope angle, and vegetation type, respectively; i and j are the coregistered coordinates of the variables. An actual tree constructed for SWE prediction is shown in Fig. 1.

A collection of such decision rules is arrived at through a technique referred to as recursive partitioning. Three elements must be defined before the sample data may be recursively partitioned into a binary decision tree: (1) method for determining the best split at each node, (2) basis for deciding when to continue or stop splitting a node, and (3) method for assigning class probabilities for each terminal node.
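The three elements listed above can be illustrated with a compact recursive-partitioning sketch. It is an assumption-laden simplification rather than the Breiman et al. (1984) procedure or the S-Plus code: the names `find_split`, `grow`, and `predict` are hypothetical, only ordinal predictors are handled, a minimum node size stands in for the stopping rule (no pruning is done), terminal nodes are assigned the mean response, and the sample values are invented.

```python
from statistics import fmean


def sse(ys):
    """Sum of squared deviations of ys from their mean."""
    m = fmean(ys)
    return sum((v - m) ** 2 for v in ys)


def find_split(rows, ys, min_size):
    """Element (1): search every variable and threshold for the lowest child SSE."""
    best = None  # (error, variable index, threshold)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            c = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, ys) if r[j] < c]
            right = [y for r, y in zip(rows, ys) if r[j] >= c]
            if len(left) < min_size or len(right) < min_size:
                continue  # skip splits that would create a too-small node
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, j, c)
    return best


def grow(rows, ys, min_size=2):
    """Recursively partition the sample into a nested dict of decisions."""
    split = find_split(rows, ys, min_size)
    # Element (2): stop when no admissible split lowers the error.
    if split is None or split[0] >= sse(ys):
        return {"value": fmean(ys)}  # Element (3): terminal node value
    _, j, c = split
    lt = [(r, y) for r, y in zip(rows, ys) if r[j] < c]
    ge = [(r, y) for r, y in zip(rows, ys) if r[j] >= c]
    return {
        "var": j,
        "cut": c,
        "lt": grow([r for r, _ in lt], [y for _, y in lt], min_size),
        "ge": grow([r for r, _ in ge], [y for _, y in ge], min_size),
    }


def predict(tree, row):
    """Follow the 'is x_m < c?' decisions down to a terminal node."""
    while "value" not in tree:
        tree = tree["lt"] if row[tree["var"]] < tree["cut"] else tree["ge"]
    return tree["value"]


# Hypothetical sample: (elevation in m, net radiation in W m^-2) and SWE in m.
rows = [(2900, 120), (2950, 125), (3000, 80), (3050, 85),
        (3300, 82), (3350, 78), (3100, 130), (3200, 128)]
ys = [0.20, 0.25, 0.60, 0.65, 1.10, 1.20, 0.22, 0.24]
tree = grow(rows, ys)
high_cell = predict(tree, (3320, 80))
shaded_cell = predict(tree, (3020, 83))
```

For this invented sample the first split is on elevation and the second on radiation. The hierarchy comes from the data rather than from the order in which the variables are supplied.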

These decisions are beyond the scope of this paper, but are explained in detail in the standard reference on classification and regression trees (Breiman et al., 1984). We have used the tree-based model implementation in the S-Plus mathematical language, which closely follows the development in Breiman et al. (1984). Details of the S-Plus software are explained in Chambers & Hastie (1992).

Fig. 1 Regression tree results for the 18 April 1987 snow survey with ten SWE classes. Values in the ellipses represent the mean SWE value for all members of the subtree growing from that node. Values in the rectangular boxes represent the modeled SWE value from the field data satisfying the partition rules leading to that particular terminal node.

STUDY SITE

Field data were collected in the Emerald Lake watershed, Sequoia National Park, California, located at 36°35'N, 118°40'W (see Fig. 2). Elevations within the gauged basin are between 2800 and 3416 m, with a relief of 636 m. The watershed area is about 120 ha, with a lake surface area of 2.85 ha. The basin has a north-facing aspect, and is flanked by nearly vertical cliffs on the south and west margins. A broad range of slopes and aspects are represented in this glaciated cirque. The topography and physiography are representative of small Sierran alpine watersheds. A detailed description of the basin can be found in Tonnessen (1991).

FIELD METHODS

Snow depth and density were measured independently in the field. Densities were measured in continuous profiles at 0.10-m increments in snow pits excavated for that purpose. Depths were taken using probes capable of measuring up to 10-m depths. Depth sample locations were located by randomly sampling the coordinates of a 25-m grid registered to a digital elevation model (DEM) of the basin. Locations of the points were transferred to orthographically corrected aerial photographs used by the field teams. Eleven surveys were completed during the 1986, 1987, and 1988 water years at about 2-week intervals following the day of peak accumulation. The number of points sampled varied between surveys and was based on an estimate of what the field teams could reasonably accomplish depending on snow depths and extent. Locations of snow pits were chosen to give a range of exposures and elevations representative of the basin and broad trends in snow density (see Fig. 2).

MODELING METHODS

Software used in the analyses included the GRASS geographic information system (GIS) (U.S. Army, 1993), ipw image processing software (Frew, 1988), and the S-Plus mathematical language (Becker et al., 1988; Chambers & Hastie, 1992). The GRASS GIS was used to calculate some of the independent variables (slope and aspect from the DEM). GRASS also served as a spatial database to store, retrieve, and manipulate independent and dependent variables. S-Plus contains programs specific to tree-based modeling, so model construction and validation were conducted in S-Plus. Once the models were formulated, they were executed in GRASS to produce spatial maps of SWE for the relevant dates and desired constraints (e.g. number of SWE classes).

Independent and dependent variables

We attempted to choose independent variables that had a known (if undefined) physically based relationship with snow distribution. The variables chosen were elevation, slope, vegetation, substrate, and net solar radiation. Elevation was derived from a 5-m resolution DEM constructed for this study. Slope was calculated for each DEM cell from the elevations. Vegetation type was taken from a digitized map and simplified into six classes of roughly equal roughness characteristics. Substrates, or soil types, were simplified similarly into three classes. Net solar radiation was calculated using atmospheric parameters derived by LOWTRAN7 (Kneizys et al., 1988) applied to a model for distributing net solar insolation over rough topographic surfaces (Dozier, 1980). Modeled radiation was calibrated with two measurement sites in the basin. The independent variable for radiation was calculated by summing daily accumulated radiation for the 15th of each month from December through the month corresponding to the closest date of the survey. This variable was the only parameter that changed with time over the period of study.
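The radiation variable described above amounts to a running sum of mid-month daily totals. The sketch below shows that bookkeeping for a single grid cell; the function name `radiation_index`, the `SEASON` list, and the radiation values are assumptions made for illustration, and the radiation model itself (LOWTRAN7 and the terrain model) is not reproduced.

```python
# Months in accumulation-season order, starting in December (assumed range).
SEASON = [12, 1, 2, 3, 4, 5, 6, 7]


def radiation_index(mid_month_radiation, survey_month):
    """Sum the mid-month daily totals from December through survey_month.

    mid_month_radiation maps month number -> modeled daily net radiation
    total for the 15th of that month, for one grid cell.
    """
    if survey_month not in SEASON:
        raise ValueError("survey month outside the assumed snow season")
    last = SEASON.index(survey_month)
    return sum(mid_month_radiation[m] for m in SEASON[: last + 1])


# Hypothetical daily totals for one cell (arbitrary units).
cell = {12: 2.0, 1: 2.5, 2: 4.0, 3: 6.5, 4: 9.0, 5: 11.0, 6: 12.0}
april_index = radiation_index(cell, 4)  # Dec + Jan + Feb + Mar + Apr
june_index = radiation_index(cell, 6)   # adds May and June
```

Because the sum grows with each later survey, this is the one predictor whose value at a given cell differs between survey dates.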

Fig. 3 Volume of SWE (m³) versus number of terminal nodes in the regression tree for each classification result from all decision trees in the 1987 water year: (a) 18 April; (b) 22 May; (c) 5 June. Solid horizontal line is the mean of all volumes from trees with 5 or more classes; dotted line is the mean volume from trees with 5 through 10 classes.

SWE is the dependent variable. SWE is the product of snow depth at a point and the snow density expressed as a fraction of the density of water. Point values of snow density measured in snow pits were interpolated based on date and location within the basin. Density is conservative, and the assumptions made in the spatial distribution of this parameter result in only small errors. Point values of SWE were calculated by multiplying the interpolated density by each coregistered field measurement of snow depth.
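The point calculation of SWE can be written out as a one-line function. This is a sketch under stated assumptions: the name `point_swe`, the water density constant of 1000 kg m⁻³, and the sample depth and density are illustrative and do not come from the field data.

```python
WATER_DENSITY = 1000.0  # kg m^-3, assumed reference density of water


def point_swe(depth_m, snow_density_kg_m3):
    """SWE (m of water) = snow depth (m) x snow density / water density."""
    return depth_m * snow_density_kg_m3 / WATER_DENSITY


# Hypothetical probe depth and interpolated pit density.
swe_m = point_swe(2.0, 400.0)
```

A 2.0-m snowpack at 400 kg m⁻³ therefore holds 0.8 m of water, which is why the density must enter as a fraction rather than as a percentage.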

From point to areal values of SWE

The five independent variables described above were selected for each coregistered point value of SWE. These data were applied in a regression tree model to select the critical independent variables and their hierarchical relationships to the dependent variable, SWE. Once the model was constructed, every cell in the basin could be assigned the appropriate SWE value based on the coregistered values of the independent variables found at each cell. This technique allowed relatively accurate assignment of SWE distribution to over 40 000 cells from only hundreds of values measured in the field. The routine was completed for each of the survey dates using the corresponding field data and appropriate radiation data, as well as the static variables of elevation, slope, vegetation, and substrate.
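Assigning SWE to every cell amounts to walking each cell's coregistered values down the fitted tree. The sketch below assumes a hand-written tree and a 2 x 2 grid. The thresholds echo the examples given earlier in the text, but the tree structure, the terminal SWE values, the layer values, and the names `TREE`, `cell_swe`, and `map_swe` are all invented for illustration and are not the tree in Fig. 1.

```python
# Hypothetical tree: ordinal nodes test 'value < lt', categorical nodes test
# 'value in set'. Terminal nodes carry a SWE value in meters.
TREE = {
    "var": "radiation", "lt": 98.5,
    "yes": {
        "var": "elevation", "lt": 3175.0,
        "yes": {"swe": 0.8},
        "no": {"swe": 1.4},
    },
    "no": {
        "var": "vegetation", "in": {2, 3},
        "yes": {"swe": 0.5},
        "no": {"swe": 0.2},
    },
}


def cell_swe(node, cell):
    """Walk the tree for one cell (dict of coregistered variable values)."""
    while "swe" not in node:
        value = cell[node["var"]]
        if "lt" in node:
            passed = value < node["lt"]
        else:
            passed = value in node["in"]
        node = node["yes"] if passed else node["no"]
    return node["swe"]


def map_swe(tree, layers):
    """Apply the tree to every cell of equally shaped 2-D layers."""
    names = list(layers)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    return [
        [
            cell_swe(tree, {name: layers[name][i][j] for name in names})
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


# Hypothetical coregistered layers on a 2 x 2 grid.
layers = {
    "elevation": [[2900.0, 3200.0], [3000.0, 3300.0]],
    "radiation": [[90.0, 95.0], [120.0, 110.0]],
    "vegetation": [[1, 1], [2, 5]],
}
swe_grid = map_swe(TREE, layers)
```

Every cell receives one of the terminal-node values, so a tree with ten terminal nodes yields a map with at most ten distinct SWE classes.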

Snow-covered area

Although four surveys were completed in the basin, snow-covered area (SCA) could be mapped accurately for only three surveys, due to limited availability of satellite imagery and aerial photography. SPOT multispectral imagery at a 20-m resolution was used to map SCA for the first survey using well-documented image processing techniques for SCA (Dozier, 1989). The results were resampled and applied to the 5-m grid SWE estimates. Aerial photographs were scanned for the third and fourth surveys and SCA was mapped at the 5-m resolution using thresholding techniques verified by visual comparison to original photographs where snow-covered areas were easily delineated from snow-free portions, even in shadowed terrain. Only results of the three surveys having SCA estimates are reported.

RESULTS AND DISCUSSION

The SWE estimates based on maximum numbers of terminal nodes provide the best estimate of SWE based on several simple metrics, but these also give a complex heterogeneous spatial estimate that is difficult to apply in a spatially-distributed snowmelt model because of computational costs. These regression trees with maximum numbers of terminal nodes also represent an over-fit result, and fit evaluations (e.g. R²) are optimistic in their assessment of the ability of the model to describe or predict the result in nature. For these reasons, we have chosen trees with five to ten terminal nodes (or discrete SWE values) for evaluation in the present paper, and we chose the 10-node case for input to the snowmelt modeling discussed in the accompanying paper by Harrington et al. (1995, this volume).

Figure 3 shows the SWE volumes predicted by the regression-tree models for all three surveys and for each model run from two to the maximum number of nodes. The maximum number of nodes varies between survey dates partly as a function of data heterogeneity and partly based on the number of observations. The mean of all modeled results with five or more terminal nodes is shown as a solid line. The mean of the volumes with 5 through 10 SWE classes is shown as a dotted line. Trees with fewer than five classes were not considered because their mean SWE values depart markedly from the values predicted by models that have more nodes but are not yet over-fit. Volumes generally increase with greater numbers of classes after about 10 terminal nodes (Fig. 3). This trend appears to follow because the field measurements from the deeper deposits that are outliers in larger groups become closer to expressing the mean of the smaller classes as further partitioning proceeds. It also appears that the increase in volume from the smallest numbers of classes may be the effect of these outliers over larger areas, in spite of their decreased effect on the mean given the larger number of members in each class.

The coefficient of determination (R²) increases with the number of SWE classes as expected (Fig. 4). However, the rapid decline in these values below five or six classes suggests that the classifications are rapidly losing the ability to delineate truly separate classes of snow distribution. For the 10-node regression trees, the model explained about 61% of the observed variation in SWE at the time of the first survey; about 58% and 48% of the variance was explained for the third and fourth surveys, respectively. These coefficients of determination (R²) do not take into account any spatial effects, but do give an indication of relative model performance.
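For a regression tree, R² can be computed by comparing each observation with the mean of its terminal node. The sketch below assumes the usual definition, one minus the ratio of residual to total sum of squares; the function name `r_squared` and the sample values are hypothetical.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Fraction of variance explained: 1 - SS_residual / SS_total."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical SWE observations (m) and the means of the two terminal
# nodes they fall in.
observed = [0.2, 0.4, 1.0, 1.2]
node_means = [0.3, 0.3, 1.1, 1.1]
r2 = r_squared(observed, node_means)
```

Because each added terminal node can only lower the residual sum of squares on the data used to grow the tree, this statistic rises with tree size whether or not the extra splits generalize.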

Fig. 4 Coefficient of determination (R²) results versus number of terminal nodes or SWE classes for all decision trees in the 1987 water year: 18 April (solid line), 22 May (dotted line), 5 June (dashed line).

Fig. 5 Cross validation results from the 18 April survey showing the number of SWE zones or terminal nodes versus the residual mean deviance (RMD, meters squared).


Cross validation of the data set used to grow the trees also suggests that the optimal number of classes or tree nodes is somewhere between 5 and 15. Figure 5 has the cross-validated number of classes plotted against tree deviance, showing a minimum deviance or best fit at about 5 to 10 classes and deterioration of the fit outside this range, particularly in the direction of smaller sized trees. This figure implies lack of fit in the trees with few terminal nodes and it exposes the optimistic over-fitting of the trees with large numbers of classes not evident in Fig. 4.

Summary statistics of the three surveys are listed in Table 1. Snow-covered area values were about 90, 50, and 35% of the basin for the first, second, and third surveys, respectively. The steep nature of the basin (mean slope of 30°) precludes accumulation on a significant portion of the basin even in extreme accumulation years such as the 1986 water year. The peak accumulation SCA value in water year 1986 was about 90% even though the accumulation in the region was at least 150% of normal (Elder et al., 1991).

Table 1 Water year 1987 statistics for snow-covered area (SCA) and snow water equivalence (SWE). SWE is the basin mean SWE value. SWE_TOT is the volume of water stored as snow in the basin in cubic meters. SWE difference (ΔSWE) is the percent deviation between measured and modeled results.

| Survey date | SCA (%) | Measured SWE (m) | Modeled SWE (m) | Measured SWE_TOT (m³) | Modeled SWE_TOT (m³) | ΔSWE (%) | n |
|---|---|---|---|---|---|---|---|
| 18 April | 89 | 0.69 | 0.67 | 734 400 | 720 700 | -1.9 | 224 |
| 22 May | 50 | 0.46 | 0.48 | 274 800 | 282 300 | +2.7 | 150 |
| 5 June | 35 | 0.42 | 0.38 | 175 400 | 158 400 | -10.7 | 90 |

SWE volume is listed in Table 1 for measured and modeled results: measured refers to the mean of the measured point SWE values multiplied by SCA; modeled SWE volume is based on the mean of the 5 through 10 class regression tree results. The range of results is used for the mean of modeled SWE volume because there is not a clear best-tree result; the best tree may vary depending on the application. The values of mean basin SWE in Table 1 are the mean of the depth measurements multiplied by their corresponding densities. Modeled SWE values in Table 1 are the modeled SWE volumes divided by SCA. The values are close for the first two surveys, but deviate by almost 11% for the final survey date. It is not clear which value is closer to the truth, but the modeled result does take factors controlling SWE distribution into account and should, therefore, be more accurate.
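The percent deviations in Table 1 can be checked directly from the tabulated volumes. The text does not state which volume serves as the denominator; the sketch below assumes the modeled volume, because that choice reproduces all three tabulated values, whereas dividing by the measured volume would give about -9.7% for 5 June. The function name `percent_deviation` is hypothetical.

```python
# Measured and modeled SWE volumes (m^3) from Table 1.
surveys = {
    "18 April": (734_400, 720_700),
    "22 May": (274_800, 282_300),
    "5 June": (175_400, 158_400),
}


def percent_deviation(measured, modeled):
    """Difference (modeled - measured) as a percentage of the modeled volume.

    The choice of denominator is an inference from the table, not a
    definition given in the text.
    """
    return 100.0 * (modeled - measured) / modeled


delta = {
    date: round(percent_deviation(measured, modeled), 1)
    for date, (measured, modeled) in surveys.items()
}
```

The rounded results match the ΔSWE column for all three survey dates.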

Figure 4 shows a decrease in the R² values for the third survey. The number of observations becomes a limiting factor in model performance as well; the average number of members in each terminal node drops from 45 to 30 and 18 in the first, second, and third surveys, respectively, for the corresponding five-class trees. Uncertainty in the terminal node values grows as these numbers fall, because uncertainty is inversely related to within-node sample size. Regardless of which value is more accurate, the modeled results of SWE are of more use because they are spatially distributed. These SWE values should produce better results in any snowmelt modeling effort when compared to a basin-wide mean value.

Our previous attempts to model SWE in alpine environments explained roughly 30% of the observed variance in the best-case results (Elder et al., 1991). Although those results were preferable to other conventional methods such as multiple linear regression, they still left unexplained an unsatisfactory portion of the variance in the field observations. The worst-case result for regression trees of ten or more terminal nodes explains about 50% of the variance (R² = 0.48) for the final survey, and close to 60% of the variance is accounted for in both of the preceding surveys (R² = 0.58 and 0.61).

One advantage of the regression tree method, which partly explains the improved results, is that it incorporates the SWE observations into the model formulation. In other words, the distribution of SWE is used in part to divide the basin into areas of similar snow properties. This method contrasts with the Bayesian classification used in Elder et al. (1991), where the basin was partitioned on physical characteristics alone and SWE was applied to the mapping after the fact.

Another major advantage of the tree-based method is the ease of interpretation of results. Figure 1 shows the final tree and decision splits for the 10-node 18 April result. Radiation is the primary split and as expected, the right side of the tree shows that the portions of the basin receiving the maximum net radiation contain the smallest SWE values. The left side of the tree can be generalized as areas of lower net radiation containing more SWE, dependent on soil type, slope, and elevation, where greater elevations accumulate more snow and steeper slopes accumulate less snow. Interpretation of the explicit formulation of multiple regression or maximum likelihood classification is complex at best. Articulation of the mathematical relationship to the physical observations, dependent and independent, and explanation to a user after formulation is onerous if not futile.

CONCLUSIONS

The use of regression trees in distributing snow over complex alpine terrain represents a major improvement over our previous attempts using conventional Bayesian classification methodologies. The method is successful partly because the measured values of SWE are explicitly used in model formulation and partly because the modeling technique is capable of describing nonlinear or hierarchical relationships between the independent and dependent variables. The spatial distribution of the estimated SWE is not optimal or without error, but the accuracy of the estimates far exceeds that of any other technique we have explored and provides a best-case starting point for applications such as spatially-distributed snowmelt modeling.

Acknowledgments D. Clow, R. Kattelmann, M. Williams, and numerous others helped generously with field work. Discussions with R. Harrington improved the analyses. Software written by J. Frew was critical to the entire project. The work was funded by the NASA EOS program.

REFERENCES

Becker, R., Chambers, J. & Wilks, A. (1988) The New S Language, A Programming Environment for Data Analysis and Graphics. Wadsworth & Brooks, Pacific Grove, California.

Breiman, L., Friedman, J., Olshen, R. & Stone, C. (1984) Classification and Regression Trees. Wadsworth & Brooks, Pacific Grove, California.


Chambers, J. & Hastie, T. (eds.) (1992) Statistical Models in S. Wadsworth & Brooks, Pacific Grove, California.

Dozier, J. (1980) A clear-sky spectral solar radiation model for snow-covered mountainous terrain. Wat. Resour. Res. 16, 709-718.

Dozier, J. (1989) Spectral signature of alpine snow cover from the Landsat Thematic Mapper. Remote Sens. Environ. 28, 9-22.

Elder, K., Dozier, J. & Michaelsen, J. (1991) Snow accumulation and distribution in an alpine watershed. Wat. Resour. Res. 27, 1541-1552.

Frew, J. (1990) The image processing workbench. Ph.D. thesis, Department of Geography, University of California, Santa Barbara.

Granberg, H. (1979) Snow accumulation and roughness changes through winter at a forest-tundra site near Schefferville, Quebec. In: Proceedings, Modeling Snow Cover Runoff (ed. by S. Colbeck & M. Ray), 83-92. U.S. Army Cold Regions Research and Engineering Laboratory, Hanover, New Hampshire.

Harrington, R. F., Elder, K. & Bales, R. C. (1995) Distributed snowmelt modeling using a clustering algorithm. In: Biogeochemistry of Seasonally Snow-Covered Catchments (ed. by K. Tonnessen, M. W. Williams & M. Tranter) (Proc. Boulder Symp., July 1995). IAHS Publ. no. 228.

Hosang, J. & Dettwiler, K. (1991) Evaluation of a water equivalent of snow cover map in a small catchment area using a geostatistical approach. Hydrol. Processes 5, 283-290.

Kneizys, F., Shettle, E., Abreu, L., Chetwynd, J., Anderson, G., Gallery, W., Selby, J. & Clough, S. (1988) Users Guide to LOWTRAN7. Report AFGL-TR-88-0177, Air Force Geophysics Laboratory, Bedford, Massachusetts.

Michaelsen, J., Davis, F. & Borchert, M. (1987) Non-parametric methods for analyzing hierarchical relationships in ecological data. Coenoses, 97-106.

Michaelsen, J., Schimel, D., Friedl, M., Davis, F. & Dubayah, R. (1994) Regression tree analysis of satellite and terrain data to guide vegetation sampling and surveys. J. Veg. Sci. 5, 673-686.

Steppuhn, H. & Dyck, G. (1974) Estimating true basin snow cover. In: Advanced Concepts and Techniques in the Study of Snow and Ice Resources, 314-328. National Academy of Sciences, Washington, D.C.

Tonnessen, K. (1991) The Emerald Lake Watershed Study: introduction and site description. Wat. Resour. Res. 27, 1537-1539.

U.S. Army (1993) GRASS 4.1 User's Reference Manual. U.S. Army Corps of Engineers, Construction Engineering Research Laboratories.

Weir, P. (1979) Topographic influences on snow accumulation at Mount Hutt. M.A. thesis, University of Canterbury, New Zealand.
