Animal Modeling
The Nebraska Gap Analysis Project (NE-GAP) has used recursive partitioning to develop statistical models that relate species occurrence data (in the form of museum voucher specimens or curated surveys) to a suite of environmental variables (Henebry et al. 2001). Here we describe the results of using different kinds of land cover data in the development of habitat models for ten bird species.
To generate the habitat models we used QUEST (Quick, Unbiased, & Efficient Statistical Trees; Loh and Shih 1997), a recursive partitioning algorithm similar to CART (Classification & Regression Trees; Breiman et al. 1984, De'ath and Fabricius 2000). QUEST has several advantages for habitat modeling: it is much faster than CART, its variable selection is unbiased, it handles categorical predictor variables with many categories, and it uses automated cross-validation (Shih 2002). The motivation for using this strategy is two-fold. First, the resulting trees of decision points and values that form the models are understandable, debatable, and tunable. Second, the nonparametric modeling can handle the multimodalities likely to be found in species occurrence data.
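To make the idea of recursive partitioning concrete, here is a minimal sketch of a binary classification tree. It uses a CART-style Gini split search, not QUEST's unbiased split selection, and the two predictors (woodland fraction and precipitation) and all data values are hypothetical illustrations rather than NE-GAP inputs.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels (0.0 for a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; leaves hold the majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return its class."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical hexagons: (woodland fraction, annual precipitation in mm).
rows = [(0.05, 400), (0.10, 450), (0.15, 700), (0.40, 650), (0.55, 720), (0.60, 500)]
labels = [0, 0, 0, 1, 1, 1]  # 1 = species present, 0 = absent

tree = grow(rows, labels)
print(tree)
print(predict(tree, (0.50, 600)), predict(tree, (0.10, 600)))
```

Each internal node is a single readable decision point (a variable and a value), which is what makes trees of this kind easy to inspect, debate, and adjust.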
Species occurrence data was gathered from route-level composites of the USGS Breeding Bird Survey (BBS; www.pwrc.usgs.gov/bbs) and circle composites of The National Audubon Society’s Christmas Bird Count (CBC; www.audubon.org/bird/cbc/) for the period 1970-2000. Given the intensive repeated observations, if a species was not reported along a sampling unit during the study period, it was considered absent.
The suite of environmental variables (land cover, climate, soils, terrain) included in the modeling process is described in Henebry et al. (2001). Modeling was performed across a hexagonal grid produced by the EPA EMAP program with a cell resolution of about 40 km² within Nebraska. Each variable was rescaled from its raster resolution (30 m for land cover, soils, and terrain data and 1500 m for climate variables) to the coarser hexagonal coverage. Continuous variables were rescaled by area-weighted averaging. Categorical variables were represented as a compositional vector. All environmental variables contained within the hexagons that intersected BBS routes or CBC circles were associated with the species occurrence data at those sampling locations.
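The two rescaling rules can be sketched as follows. This is an illustration under stated assumptions: the function names, the (value, area) input layout, and the sample numbers are hypothetical, and the sketch takes the overlap areas between raster cells and a hexagon as already computed.

```python
from collections import defaultdict

def area_weighted_mean(pieces):
    """Average a continuous variable over one hexagon.

    pieces: list of (value, overlap_area) pairs, one per raster cell
    that intersects the hexagon.
    """
    total_area = sum(area for _, area in pieces)
    return sum(value * area for value, area in pieces) / total_area

def composition(pieces):
    """Summarize a categorical variable as a compositional vector.

    pieces: list of (class_label, overlap_area) pairs.
    Returns {class_label: fraction of hexagon area}, with fractions summing to 1.
    """
    totals = defaultdict(float)
    for label, area in pieces:
        totals[label] += area
    total_area = sum(totals.values())
    return {label: area / total_area for label, area in totals.items()}

# Hypothetical hexagon: two climate cells and three land cover patches (areas in km^2).
precip = area_weighted_mean([(500.0, 10.0), (600.0, 30.0)])
cover = composition([("grassland", 25.0), ("cropland", 10.0), ("grassland", 5.0)])
print(precip, cover)
```

A compositional vector keeps the share of every class in the hexagon rather than only the dominant class, so minor cover types remain available to the tree as predictors.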
Two separate land cover classifications were used: the NE-GAP land cover product and the USGS National Land Cover Data (NLCD). We also included several variables from the National Land Cover Pattern Data (NLCPD), which is based on the NLCD (Riitters et al. 2000). We used five of the landscape metrics in the NLCPD: contagion, forest fragmentation, forest-area density, human-use index, and land cover diversity (cf. Riitters et al. 2000). Spatial filters or fixed-area windows were applied to the NLCD map to generate the NLCPD maps. A pixel in a pattern map incorporates information from the surrounding 65.61 ha (27 x 27 window) in the original NLCD map (Riitters et al. 2000).
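The window area quoted above follows directly from the 30 m land cover pixel size and the 27 x 27 window. A short check of the arithmetic (the variable names are ours):

```python
PIXEL_SIZE_M = 30      # NLCD raster resolution in meters
WINDOW_PIXELS = 27     # window width and height in pixels
M2_PER_HA = 10_000     # square meters per hectare

side_m = PIXEL_SIZE_M * WINDOW_PIXELS   # 810 m on a side
area_m2 = side_m ** 2                   # 656,100 square meters
area_ha = area_m2 / M2_PER_HA           # 65.61 ha
print(side_m, area_m2, area_ha)
```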
The pattern metrics were reclassified from continuous indices on the unit interval [0, 1] to a categorical scheme that indicates landscape connectivity. Forest-area density and human-use index were regrouped using the critical thresholds (CT) predicted from percolation theory for random maps using various neighborhood rules (Turner et al. 2001). As neighborhood size grows, the CT for the emergence of high landscape connectivity drops. The 4-, 8-, 12-, and 24-neighbor rules were calculated individually for these metrics and given a value of 0 (no value), 1 (below the CT), or 2 (above the CT). A Landscape Connectivity Indicator (LCI) was produced using the four neighborhood rules for both the forest-area density and human-use index (Table 1). For example, forest-area density LCI class 1 portrays wooded areas that are highly fragmented or isolated, since all of the values are below the CTs. As LCI class increases, the CT decreases, and the likelihood of landscape connectivity increases. The other pattern metrics (contagion, land cover diversity, forest fragmentation) were reclassified by quartiles. All landscape pattern indices were represented as compositional vectors within hexagons.
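Table 1. Landscape Connectivity Indicator (LCI) classes under the 4-, 8-, 12-, and 24-neighbor rules (0 = no value, B = below the CT, A = above the CT).

| LCI Class | 4 | 8 | 12 | 24 |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | B | B | B | B |
| 2 | B | B | B | A |
| 3 | B | B | A | A |
| 4 | B | A | A | A |
| 5 | A | A | A | A |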
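The LCI scheme can be sketched as a small classifier. The threshold values below are approximate site-percolation thresholds for random maps and are an assumption on our part; they are not quoted from the source and should be checked against Turner et al. (2001). Treating `None` as "no value" and a value exactly at a threshold as "below" are also assumptions.

```python
# ASSUMED approximate critical thresholds per neighborhood rule (not from the source).
CRITICAL_THRESHOLDS = {4: 0.5928, 8: 0.4072, 12: 0.2923, 24: 0.1684}

def rule_codes(p, thresholds=CRITICAL_THRESHOLDS):
    """Code a metric value under each rule: '0' no value, 'B' below CT, 'A' above CT."""
    if p is None:
        return {rule: "0" for rule in thresholds}
    return {rule: ("A" if p > ct else "B") for rule, ct in thresholds.items()}

def lci_class(p, thresholds=CRITICAL_THRESHOLDS):
    """Return LCI class 0-5: 0 for no value, else 1 + number of rules exceeded."""
    if p is None:
        return 0
    return 1 + sum(1 for ct in thresholds.values() if p > ct)

# Hypothetical forest-area density values for a few pixels.
for p in (None, 0.10, 0.20, 0.35, 0.50, 0.70):
    print(p, rule_codes(p), lci_class(p))
```

Because the threshold falls as the neighborhood grows, a value that exceeds the CT under one rule also exceeds it under every larger neighborhood, which is why only the six B/A patterns of Table 1 can occur.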
Ten bird species native to Nebraska were considered. Of the six woodland species modeled, two species—gray catbird (Dumetella carolinensis) and song sparrow (Melospiza melodia)—utilize riparian areas, red-breasted nuthatch (Sitta canadensis) is found primarily in coniferous woodlands, and the remaining three species—eastern wood-pewee (Contopus virens), great crested flycatcher (Myiarchus crinitus), and red-bellied woodpecker (Melanerpes carolinus)—occur mainly in deciduous woodlands. We modeled two wetlands species—black tern (Chlidonias niger) and black-crowned night heron (Nycticorax nycticorax)—and two grassland species—eastern meadowlark (Sturnella magna) and greater prairie-chicken (Tympanuchus cupido).
Species were modeled using the following land cover information: (1) the NE-GAP land cover product; (2) the NLCD alone; or (3) the NLCD plus the NLCPD. Occurrence data and associated environmental variables for each species were submitted to QUEST. Resulting statistical trees were trimmed or pruned interactively by querying the hexagonal coverage of environmental variables to evaluate the sensitivity of the tree splits and assess model generality. The final tree served as the wildlife-habitat relationship model. It was inverted to produce the predicted habitat distributions for each species. Model fitness was evaluated in two ways: the proportion of the occurrences explained and the visual appearance of the predicted range distribution.
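As a rough sketch of the last two steps, a final tree can be read as a rule, applied to every hexagon to map predicted habitat, and then scored against the occurrence records. The rule, variable names, and hexagon values below are hypothetical, and reading "proportion of the occurrences explained" as the share of occurrence hexagons that fall inside predicted habitat is our assumption.

```python
def predict_habitat(hexagons, rule):
    """Return the ids of hexagons whose environmental variables satisfy the rule."""
    return {hex_id for hex_id, env in hexagons.items() if rule(env)}

def proportion_explained(predicted, occurrences):
    """Fraction of occurrence hexagons that fall inside the predicted habitat."""
    occurrences = set(occurrences)
    return len(occurrences & predicted) / len(occurrences)

# Hypothetical hexagon coverage with two environmental variables.
hexagons = {
    "h1": {"woodland": 0.45, "precip_mm": 700},
    "h2": {"woodland": 0.05, "precip_mm": 650},
    "h3": {"woodland": 0.60, "precip_mm": 720},
    "h4": {"woodland": 0.30, "precip_mm": 480},
    "h5": {"woodland": 0.35, "precip_mm": 600},
}

# Hypothetical rule read off one root-to-leaf path of a pruned tree.
def rule(env):
    return env["woodland"] > 0.275 and env["precip_mm"] > 550

predicted = predict_habitat(hexagons, rule)
score = proportion_explained(predicted, ["h1", "h3", "h4"])
print(sorted(predicted), round(score, 3))
```

A score of this kind says nothing about over-prediction, which is one reason a visual check of the predicted range is a useful second criterion.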
Half of the NLCD models showed no significant difference from the NE-GAP models, and the other half exhibited worse fits (Table 2). Inclusion of the landscape pattern variables degraded the predicted range distribution in most cases. However, a forest-area density class improved the range predictions for one woodlands bird (great crested flycatcher) and one grasslands bird (greater prairie-chicken). Although land cover diversity was the pattern variable most frequently selected by QUEST, it failed to improve range predictions. Inclusion of the human-use index, which is keyed to agricultural land use, also did not yield improvements over the NE-GAP model. Neither contagion nor forest fragmentation was selected for inclusion in any model.
Table 2. Fit of the NLCD and NLCD + NLCPD models relative to the NE-GAP models (NC = no change, - = worse, + = improved).

| Species | NLCD | NLCD + NLCPD | Habitat Type |
|---|---|---|---|
| Eastern Meadowlark | - | - | Grasslands |
| Greater Prairie-Chicken | - | + | Grasslands |
| Black Tern | NC | - | Wetlands |
| Black-crowned Night Heron | - | - | Wetlands |
| Eastern Wood-Pewee | NC | NC | Woodlands |
| Gray Catbird | NC | NC | Woodlands |
| Great Crested Flycatcher | - | + | Woodlands |
| Red-bellied Woodpecker | - | - | Woodlands |
| Red-breasted Nuthatch | NC | NC | Woodlands |
| Song Sparrow | NC | NC | Woodlands |
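The entries of Table 2 can be tallied with a few lines of code. The values are transcribed from the table; reading "NC" as no change, "-" as a worse fit, and "+" as an improved fit relative to the NE-GAP model is our interpretation of the accompanying text.

```python
from collections import Counter

# (NLCD, NLCD + NLCPD) outcomes per species, transcribed from Table 2.
TABLE_2 = {
    "Eastern Meadowlark": ("-", "-"),
    "Greater Prairie-Chicken": ("-", "+"),
    "Black Tern": ("NC", "-"),
    "Black-crowned Night Heron": ("-", "-"),
    "Eastern Wood-Pewee": ("NC", "NC"),
    "Gray Catbird": ("NC", "NC"),
    "Great Crested Flycatcher": ("-", "+"),
    "Red-bellied Woodpecker": ("-", "-"),
    "Red-breasted Nuthatch": ("NC", "NC"),
    "Song Sparrow": ("NC", "NC"),
}

nlcd_tally = Counter(nlcd for nlcd, _ in TABLE_2.values())
pattern_tally = Counter(pattern for _, pattern in TABLE_2.values())
improved = sorted(sp for sp, (_, pattern) in TABLE_2.items() if pattern == "+")

print(dict(nlcd_tally))
print(dict(pattern_tally))
print(improved)
```

The NLCD column splits evenly between "NC" and "-", matching the statement that half of the NLCD models showed no difference and half fit worse.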
1. The land cover classification scheme does indeed make a difference in habitat modeling. Models developed using the NLCD alone performed as well as or worse than the models developed with the NE-GAP land cover. The principal reason for this performance difference is the greater thematic resolution available in the NE-GAP land cover. The NLCD uses 21 classes for the entire conterminous US; in contrast, NE-GAP uses 20 classes in Nebraska alone. In the NLCD, over half of Nebraska is assigned to the "Grassland/Herbaceous" cover type. This broad brush obliterates distinctions between grassland communities that are very different in terms of species composition, canopy structure, and net primary productivity. The additional discrimination among grassland communities in the NE-GAP land cover produces more specific habitat models that yield predicted ranges that are more restricted geographically.
2. Spatial information available through the NLCPD can provide useful additional variables for the modeling process. However, their utility needs to be evaluated on an individual basis. Inclusion of our Landscape Connectivity Indicator based on the NLCPD forest-area density variable yielded improvements in two cases. However, for most species inclusion of landscape pattern variables failed to improve and even degraded range predictions.
3. Developing habitat models using statistical trees generated from species occurrence data and environmental variables can lend a greater degree of objectivity to the modeling process, but there is still considerable subjectivity in the pruning stage that is necessary for model generality.
This work was supported in part through the GAP Research Project "Evaluating the Use of Statistical Decision Trees for Modeling Avian Habitat and Regional Range Distribution from Occurrence Data and Environmental Variables."
Breiman, L., J.H. Friedman, R.A. Olshen, and C.J. Stone. 1984. Classification and regression trees. Wadsworth and Brooks/Cole, Monterey, California. 358 pp.
De’ath, G., and K.E. Fabricius. 2000. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81:3178-3192.
Henebry, G.M., B.C. Putz, and J.W. Merchant. 2001. Modeling reptile and amphibian range distributions from species occurrences and landscape variables. Gap Analysis Bulletin 10:22-24.
Loh, W.-Y., and Y.-S. Shih. 1997. Split selection methods for classification trees. Statistica Sinica 7:815-840.
Riitters, K.H., J.D. Wickham, J.E. Vogelmann, and K.B. Jones. 2000. National land-cover pattern data. Ecology 81:604.
Shih, Y.-S. 2002. QUEST User Manual. Department of Mathematics, National Chung Cheng University, Taiwan. April 17, 2002.
Turner, M.G., R.H. Gardner, and R.V. O’Neill. 2001. Landscape ecology in theory and practice: Pattern and process. Springer-Verlag New York, Inc., New York. 401 pp.