
Developing a scientifically rigorous framework for enhancing and evaluating vertebrate models

Edward J. Laurent, Steven G. Williams and Alexa J. McKerrow

Biodiversity and Spatial Information Center, North Carolina State University, Raleigh, North Carolina

Introduction

The Southeast Regional Gap Analysis Project (SEReGAP) is exploring the potential for enhancing GAP vertebrate models to address specific needs of the conservation community. Our goal was to develop a framework for modeling accurate and precise estimates of habitat suitability, population sizes and viability for multiple priority species at a small spatial grain (e.g., 1 ha) over large regions (i.e., multiple states). A pilot project (Williams and McKerrow 2005) has resulted in a scientifically rigorous foundation for these spatially explicit predictions. Specific activities have focused on 1) reclassifying land cover maps into avicentric classes; 2) developing an extensive database of spatially explicit species-habitat-population relationships; 3) developing toolboxes for fitting predictive models; 4) conducting sensitivity analyses to determine which species distribution model inputs have the greatest influence in predicting species' occurrence areas; 5) locating and integrating existing bird survey records from multiple sources into a spatially explicit database; 6) extrapolating bird species occurrences using occurrence data and inductive modeling techniques; and 7) validating SEReGAP occurrence and suitability models.

Pilot project objectives and a list of focal species were established at a meeting in Asheville, NC. Meeting participants included regional researchers, managers and bird species experts. To reduce computer processing time and streamline the pilot project, it was decided to limit the pilot study region to the portion of the Appalachian Mountains Bird Conservation Region falling within North Carolina and to focus on six forest bird species. Focal species included those listed as a conservation priority by Partners in Flight (Rich et al. 2004). Because most disturbances in the study region are associated with forest loss or conversion due to timber harvesting and residential development, we selected species whose habitat associations differ across a wide range of forest ages and composition. The focal species were Acadian Flycatcher (Empidonax virescens), Golden-winged Warbler (Vermivora chrysoptera), Hooded Warbler (Wilsonia citrina), Scarlet Tanager (Piranga olivacea), Worm-eating Warbler (Helmitheros vermivorus), and Yellow-breasted Chat (Icteria virens). Once suitability models are developed for the pilot species in the mountains of North Carolina, the methods will be applied to these and other species over the entire SEReGAP region.

Avicentric Classes

Landscape analyses of species-habitat associations should be conducted using maps representing functional differences in how focal species perceive and respond to landscape composition and structure (Morrison et al. 1998, Wiens et al. 2002). For this reason, U.S. Fish and Wildlife Service biologists generated a list of avicentric land cover classes to spatially partition the southeast region based on published habitat descriptions and commonly used terms. The avicentric classes were cross-walked to various vegetation maps, including National Land Cover Data (NLCD; Homer et al. 2004). However, existing vegetation maps are not expected to provide the classification precision and predictive accuracy of the soon-to-be-released detailed GAP vegetation maps based on NatureServe ecological systems (Comer et al. 2003). In the interim, avicentric classes were mapped for the study region using a combination of two available data layers: we intersected NLCD classes with 13 terrestrial landform classes. The resulting groups were aggregated and/or reclassified into avicentric classes (Figure 1).

Figure 1. 3-D projection of avicentric land cover classes in the Asheville region of North Carolina. Pixels are 30-m x 30-m.
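The intersect-and-reclassify step described above can be illustrated with a minimal sketch. It assumes two aligned grids of integer codes and a lookup table from each (land cover, landform) pair to an avicentric code. The land cover codes follow the NLCD legend, but the landform codes and the pairings themselves are invented for illustration and are not the project's actual crosswalk.

```python
from typing import Dict, List, Tuple

# Hypothetical crosswalk: (land cover code, landform code) -> avicentric code.
# Land cover codes follow the NLCD legend (41 deciduous forest, 42 evergreen
# forest); landform codes 1 (cove) and 2 (dry ridge) are invented here.
CROSSWALK: Dict[Tuple[int, int], int] = {
    (41, 1): 60300,  # mixed mesophytic (cove hardwood)
    (41, 2): 60601,  # dry mixed hardwoods
    (42, 2): 50500,  # other pine forest
}
UNCLASSIFIED = 0  # placeholder for pairs missing from the crosswalk


def reclassify(
    land_cover: List[List[int]],
    landform: List[List[int]],
    crosswalk: Dict[Tuple[int, int], int],
) -> List[List[int]]:
    """Intersect two aligned grids cell by cell and map each pair to a class."""
    if len(land_cover) != len(landform):
        raise ValueError("grids must have the same number of rows")
    result = []
    for lc_row, lf_row in zip(land_cover, landform):
        if len(lc_row) != len(lf_row):
            raise ValueError("grids must have the same number of columns")
        result.append([crosswalk.get(pair, UNCLASSIFIED) for pair in zip(lc_row, lf_row)])
    return result


land_cover = [[41, 41], [42, 11]]  # 11 is open water in NLCD
landform = [[1, 2], [2, 1]]
avicentric = reclassify(land_cover, landform, CROSSWALK)
print(avicentric)  # [[60300, 60601], [50500, 0]]
```

Keeping the crosswalk as data rather than logic means the same function could be reused unchanged if a more detailed vegetation map replaces the interim layers.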

Literature Review Database

SEReGAP has developed a new method for documenting literature reviews conducted by multiple people, using a relational database to create synergy among their efforts. Research details recorded in the database are spatially and temporally explicit and divided into modular units. Each record can therefore be queried for information describing a study's date, location, method of data collection, species studied, land cover types and landscape relationships (e.g., patch size, distance to water), as well as qualitative descriptions of habitat suitability and quantitative demographic parameters (e.g., density, daily nest survival) under those conditions. Efforts are currently underway to make the database forms and queries available on-line so that ecologists may work together to limit redundant efforts and build a robust repository of research results for macroecological investigations and predictive modeling.
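As a rough illustration of how such modular records might be stored and queried, the sketch below uses Python's built-in sqlite3 module. The table and column names are assumptions made for this example, not the SEReGAP schema, and the inserted rows are placeholders rather than real studies or measurements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE study (
        study_id  INTEGER PRIMARY KEY,
        citation  TEXT NOT NULL,
        year      INTEGER,
        location  TEXT,
        method    TEXT
    );
    CREATE TABLE finding (
        finding_id INTEGER PRIMARY KEY,
        study_id   INTEGER NOT NULL REFERENCES study(study_id),
        species    TEXT NOT NULL,
        land_cover TEXT,
        parameter  TEXT,
        value      REAL,
        units      TEXT
    );
    """
)

# Placeholder rows only; these are not real studies or measurements.
conn.executemany(
    "INSERT INTO study VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Placeholder study A", 2001, "western North Carolina", "point count"),
        (2, "Placeholder study B", 2003, "northern Georgia", "nest monitoring"),
    ],
)
conn.executemany(
    "INSERT INTO finding (study_id, species, land_cover, parameter, value, units) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Hooded Warbler", "Mixed mesophytic (cove hardwood)", "density", 0.4, "pairs/ha"),
        (1, "Hooded Warbler", "Dry mixed hardwoods", "density", 0.1, "pairs/ha"),
        (2, "Hooded Warbler", "Mixed mesophytic (cove hardwood)", "daily nest survival", 0.95, "probability"),
    ],
)

# Query: all density estimates for one species, highest first.
rows = conn.execute(
    """
    SELECT s.citation, f.land_cover, f.value, f.units
    FROM finding AS f JOIN study AS s USING (study_id)
    WHERE f.species = ? AND f.parameter = ?
    ORDER BY f.value DESC
    """,
    ("Hooded Warbler", "density"),
).fetchall()
for row in rows:
    print(row)
```

Separating study-level details from individual findings lets one study contribute many species, land cover, and parameter combinations without repeating its date, location, or method.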

Predictive Model Toolboxes 

Spatial models based on habitat affinities derived from the literature review database and expert review are being created for each of the priority bird species. The models consist of multiple map algebra expressions defined within ArcGIS 9.0 toolboxes (Environmental Systems Research Institute Inc., Redlands, CA). Similar to the methods presented by Larson et al. (2003), data layers used to predict presence/absence of species in traditional Boolean GAP models are scored by suitability levels ranging from 0 to 1. Suitability scores will be determined through an evaluation of relevant research results summarized in the above-mentioned literature review database. For example, if a species is predicted as present given a particular land cover class and distance to water, both of these inputs are scored from 0 to 1. Scores are based on a review of the species' response to these conditions in terms of standardized variations in density, nesting success, predation rates, etc. Categorical variables (i.e., avicentric classes and their categorical modifiers) are assigned discrete scores. The suitability of continuous variables (e.g., elevation, patch size, distance to water) is fit to one of many possible response curves. The ranked data layers are then combined using habitat suitability modeling techniques (see Larson et al. 2003) to describe spatial gradients of habitat suitability for focal species across the study region.
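The scoring scheme can be sketched for a single map cell as follows. The discrete scores, the linear ramp response curves and their breakpoints, and the use of a geometric mean to combine layers are all assumptions chosen for illustration; the text does not specify which curves or combination rule the toolboxes apply.

```python
import math

# Hypothetical discrete scores for avicentric class codes.
LAND_COVER_SCORE = {60300: 1.0, 60601: 0.6, 80100: 0.0}


def ramp_up(x, low, high):
    """Linear response curve: 0 at or below low, 1 at or above high."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (x - low) / (high - low)))


def ramp_down(x, low, high):
    """Mirror image of ramp_up: 1 at or below low, 0 at or above high."""
    return 1.0 - ramp_up(x, low, high)


def combine(scores):
    """Geometric mean of layer scores; any zero makes the cell unsuitable."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s == 0.0 for s in scores):
        return 0.0
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def cell_suitability(cover_code, patch_ha, dist_water_m):
    """Score one cell from a categorical layer and two continuous layers."""
    scores = [
        LAND_COVER_SCORE.get(cover_code, 0.0),   # categorical: discrete score
        ramp_up(patch_ha, 0.0, 100.0),           # larger patches score higher
        ramp_down(dist_water_m, 100.0, 500.0),   # farther from water scores lower
    ]
    return combine(scores)


print(cell_suitability(60300, 100.0, 50.0))  # 1.0
print(cell_suitability(80100, 100.0, 50.0))  # 0.0
print(round(cell_suitability(60601, 25.0, 100.0), 3))
```

A geometric mean is used here because a single unsuitable layer then forces the combined score to zero, which mirrors the behavior of a Boolean model while still expressing gradients elsewhere.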

Sensitivity Analysis

Each ranked data layer included in the habitat suitability models will be assessed for its influence on modeled occupancy areas. For example, if a species is known to occur within 100 m of water, then this occupancy rule will be tested for its independent contribution to the total area classified as species presence by the model. Independent contributions of each model parameter will be summarized for all species to identify parameters that commonly make large contributions to model predictions. Similarly, data layers that show no significant contribution to the model's predictions can be dropped, thereby reducing overall model complexity while retaining a similar level of accuracy.
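One simple way to measure a rule's independent contribution is a leave-one-out comparison: drop each rule in turn and count how many additional cells the model would classify as occupied. The sketch below assumes a Boolean model in which a cell is occupied only if it passes every rule; the rules, thresholds, and cells are hypothetical.

```python
CELL_AREA_HA = 0.09  # one 30-m x 30-m pixel

# Hypothetical occupancy rules; each takes a cell and returns True or False.
RULES = {
    "forest_cover": lambda cell: cell["cover"] in {60300, 60601},
    "near_water": lambda cell: cell["dist_water_m"] <= 100,
    "elevation": lambda cell: cell["elev_m"] < 1500,
}


def occupied_cells(cells, rules):
    """Count the cells that satisfy every occupancy rule."""
    return sum(1 for cell in cells if all(rule(cell) for rule in rules.values()))


def rule_contributions(cells, rules):
    """Cells gained when each rule is dropped in turn (leave-one-out)."""
    baseline = occupied_cells(cells, rules)
    gains = {}
    for name in rules:
        reduced = {k: v for k, v in rules.items() if k != name}
        gains[name] = occupied_cells(cells, reduced) - baseline
    return gains


cells = [
    {"cover": 60300, "dist_water_m": 50, "elev_m": 800},
    {"cover": 60300, "dist_water_m": 300, "elev_m": 800},
    {"cover": 60601, "dist_water_m": 80, "elev_m": 900},
    {"cover": 80100, "dist_water_m": 20, "elev_m": 700},
    {"cover": 60300, "dist_water_m": 400, "elev_m": 1000},
]
gains = rule_contributions(cells, RULES)
for name, gain in gains.items():
    print(f"{name}: {gain} cells ({gain * CELL_AREA_HA:.2f} ha)")
```

In this toy example the elevation rule excludes no cells that the other rules do not already exclude, so it would be a candidate for removal.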

Locate and Integrate Data

To validate SEReGAP presence/absence maps and create a repository of data for the development of more sophisticated data-driven models, occurrence data are being collected and organized for the priority species throughout a three-state area (North Carolina, South Carolina, and Georgia). Data sources so far include the NC Wildlife Resources Commission, National Park Service, state breeding bird atlases, the USGS Breeding Bird Survey, Natural Heritage Programs' element occurrences, U.S. Forest Service R8Bird data, the Monitoring Avian Productivity and Survivorship program, museum records and any other digital data sets that may become available. These diverse data are imported into a relational database to derive spatially and statistically appropriate datasets for model creation and validation.

Inductive Modeling

Several powerful techniques have recently become available for extrapolating wildlife distribution patterns using relationships between locations where species have been observed and mapped environmental conditions at those locations. Important predictors for individual species can be identified by several methods, including principal components analysis, hierarchical partitioning (Mac Nally 2002), classification and regression tree analysis (CART; De'Ath and Fabricius 2000), and expert opinion. Using these new modeling techniques, locations that have not been surveyed will be labeled for species presence or absence based on their "similarity" to locations within the survey data repository. However, the model algorithms differ in how "similarity" is defined, in their predictive accuracies, and in the interpretability of the mechanisms behind the predicted patterns. For these reasons, SEReGAP will investigate the appropriateness of these techniques under different conservation objectives. Depending on data availability, some species will be selected for model development and evaluation. Only species with adequate data quality, quantity, and dispersion throughout the study area will be modeled. However, the "gap" in data availability for other species will be noted as a future research priority. Several inductive modeling techniques may be explored for each species, including DOMAIN (Carpenter et al. 1993), PHASE1 (Laurent et al. 2005), multiple logistic regression, CART, and maximum entropy (Phillips et al. 2004). Each of these techniques has advantages and disadvantages that could make it suitable for a particular species or data set. In addition, multiple models may be used for each species, providing a more robust evaluation of habitat suitability, including the possible use of deductive rules such as land cover use.
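To make the notion of "similarity" concrete, the following is a minimal sketch in the spirit of DOMAIN, which scores a candidate location by its maximum Gower similarity to any location where the species was recorded. The environmental variables and values are hypothetical, and this simplified version omits details of the published procedure such as thresholding the resulting scores.

```python
def variable_ranges(presences):
    """Range of each environmental variable across the presence sites."""
    ranges = [max(column) - min(column) for column in zip(*presences)]
    if any(r == 0 for r in ranges):
        raise ValueError("every variable must vary across presence sites")
    return ranges


def gower_similarity(a, b, ranges):
    """One minus the mean range-standardized absolute difference."""
    distance = sum(abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(ranges)
    return 1.0 - distance


def domain_score(candidate, presences, ranges):
    """Similarity of a candidate site to its most similar presence site."""
    return max(gower_similarity(candidate, site, ranges) for site in presences)


# Hypothetical presence sites: (elevation in m, canopy cover in percent).
presences = [(800, 90), (1000, 70), (1200, 80)]
ranges = variable_ranges(presences)

print(domain_score((1000, 70), presences, ranges))  # 1.0, identical to a presence site
print(domain_score((900, 85), presences, ranges))   # 0.75
print(domain_score((2000, 10), presences, ranges))  # negative: outside the sampled range
```

Because the score depends only on the single most similar presence site, it is easy to interpret, but it is also sensitive to outlying records, which is one reason to compare it against other techniques.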

Model Validation

Species occurrence records from the survey data repository will be used to validate suitability and inductive models. Because the occurrence records were obtained using different sampling methods over various spatial scales, the data will be stratified into multiple categories and each category will be used to validate predictions in different ways. For example, species occurrences within 100-m radius point count surveys will be compared to occurrence predictions and/or suitability estimates summarized over a spatially representative area. 
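The point count comparison can be sketched as two steps: summarize predicted suitability over the cells within 100 m of each survey point, then tabulate agreement between predictions and observations. The grid, survey points, and the 0.5 presence threshold below are toy assumptions, not project data or settings.

```python
import math

CELL_M = 30.0  # pixel size of the suitability grid


def mean_within_radius(grid, row, col, radius_m):
    """Mean of the cells whose centers lie within radius_m of cell (row, col)."""
    reach = int(radius_m // CELL_M)
    values = []
    for r in range(max(0, row - reach), min(len(grid), row + reach + 1)):
        for c in range(max(0, col - reach), min(len(grid[0]), col + reach + 1)):
            if math.hypot((r - row) * CELL_M, (c - col) * CELL_M) <= radius_m:
                values.append(grid[r][c])
    return sum(values) / len(values)


def confusion(observed, predicted):
    """Return (true pos, false pos, false neg, true neg) for paired booleans."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    tn = sum(1 for o, p in pairs if not o and not p)
    return tp, fp, fn, tn


# Toy suitability grid: suitable (1.0) on the left, unsuitable (0.0) on the right.
grid = [[1.0] * 6 + [0.0] * 6 for _ in range(7)]
# Hypothetical point counts: (row, col, species detected?).
points = [(3, 1, True), (3, 2, True), (3, 10, False), (3, 9, True)]
THRESHOLD = 0.5  # assumed cutoff for calling a location "predicted present"

observed = [detected for _, _, detected in points]
predicted = [mean_within_radius(grid, r, c, 100.0) >= THRESHOLD for r, c, _ in points]
tp, fp, fn, tn = confusion(observed, predicted)
print(tp, fp, fn, tn)  # 2 0 1 1
```

From these counts, measures such as the proportion of detections that fall in predicted habitat (here 2 of 3) can be reported separately for each survey method.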

Conclusions

The utility and popularity of GAP mapping products are matched by the chorus of calls for more products of greater detail and information content. This pilot project responds to those calls by developing a scientifically rigorous foundation for enhancing the utility of GAP vertebrate models. Regional habitat suitability relationships and maps for a subset of priority species could better inform resource managers when prioritizing areas for conservation activities. Sensitivity analysis will help refine the predictions, while existing data sources will be used to evaluate their accuracy and posit new hypotheses of species-habitat relationships. Furthermore, we approach the mapping of habitat suitability as an intermediate step towards more difficult predictions of population size and viability across large regions.

Literature Cited

Carpenter, G., A. N. Gillison, and J. Winter. 1993. DOMAIN: a flexible modeling procedure for mapping potential distributions of plants and animals. Biodiversity and Conservation 2:667-680.

Comer, P., D. Faber-Langendoen, R. Evans, S. Gawler, C. Josse, G. Kittel, S. Menard, M. Pyne, M. Reid, K. Schulz, K. Snow, and J. Teague. 2003. Ecological Systems of the United States: A Working Classification of U.S. Terrestrial Systems. NatureServe, Arlington, VA.

De'Ath, G., and K. E. Fabricius. 2000. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81(11):3178-3192.

Homer, C., C. Huang, L. Yang, B. Wylie, and M. Coan. 2004. Development of a 2001 National Landcover Database for the United States. Photogrammetric Engineering and Remote Sensing 70(7):829-840.

Larson, M. A., W. D. Dijak, F. R. Thompson III, and J. J. Millspaugh. 2003. Landscape-level Habitat Suitability Models for Twelve Wildlife Species in Southern Missouri. USDA Forest Service General Technical Report NC-233.

Laurent, E. J., H. Shi, D. Gatziolis, J. B. LeBouton, M. Walters, and J. Liu. 2005. Using the spectral and spatial precision of satellite imagery for analyses of wildlife distribution patterns. Remote Sensing of Environment 97:249-262.

Mac Nally, R. C. 2002. Multiple regression and inference in ecology and conservation biology: further comments on identifying important predictor variables. Biodiversity and Conservation 11:1397-1401.

Morrison, M. L., B. G. Marcot, and R. W. Mannan. 1998. Wildlife-Habitat Relationships: Concepts and Applications. The University of Wisconsin Press, Madison.

Phillips, S. J., M. Dudík, and R. E. Schapire. 2004. A maximum entropy approach to species distribution modeling. In C. E. Brodley, editor. Machine Learning, Proceedings of the Twenty-first International Conference (ICML 2004), Banff, Canada, July 4-8, 2004. ACM Press, New York, NY.

Rich, T. D., C. J. Beardmore, H. Berlanga, P. J. Blancher, M. S. W. Bradstreet, G. S. Butcher, D. W. Demarest, E. H. Dunn, W. C. Hunter, E. E. Iñigo-Elias, J. A. Kennedy, A. M. Martell, A. O. Panjabi, D. N. Pashley, K. V. Rosenberg, C. M. Rustay, J. S. Wendt, and T. C. Will. 2004. Partners in Flight North American Landbird Conservation Plan. Cornell Lab of Ornithology, Ithaca, NY.

Wiens, J. A., B. Van Horne, and B. R. Noon. 2002. Integrating landscape structure and scale into natural resources management. Pages 23-67 in J. Liu and W. W. Taylor, editors. Integrating Landscape Ecology into Natural Resources Management. Cambridge University Press, New York.

Williams, S. G., and A. J. McKerrow. 2005. Refining Southeast Regional GAP models for use in regional bird conservation planning: A pilot project. Gap Analysis Bulletin No. 13.

Appendix. Avicentric classes developed by U.S. Fish and Wildlife Service biologists for describing functional landscape heterogeneity in the southeastern U.S. for landbirds.

| Avicentric Classes | Level 1 | Level 2 | Level 3 | Code |
| --- | --- | --- | --- | --- |
| Eastern Grasslands | I | | | 10000 |
| Tamaulipan prairie | I | A | | 10100 |
| Tall grass | I | B | | 10200 |
| Meadows/ Florida and Georgia prairies | I | C | | 10300 |
| Agricultural and cropland | I | D | | 10400 |
| Pasture | I | E | | 10500 |
| Rank annuals | I | F | | 10600 |
| Freshwater Wetland Communities | II | | | 20000 |
| Non-forested | II | A | | 20100 |
| Freshwater emergent marsh | II | A | 1 | 20101 |
| Bogs/fens/ephemeral wetlands | II | A | 2 | 20102 |
| Mudflats/sandbars | II | A | 3 | 20103 |
| Forested | II | B | | 20200 |
| Bottomland hardwood | II | B | 1 | 20201 |
| Cypress-tupelo | II | B | 2 | 20202 |
| Atlantic white cedar | II | B | 3 | 20203 |
| Pocosin/Carolina bays | II | B | 4 | 20204 |
| Riparian | II | C | | 20300 |
| Open Fresh Water | II | D | | 20400 |
| Eastern Shrub-Scrub | III | | | 30000 |
| Xeric Florida scrub | III | A | | 30100 |
| Tamaulipan scrub | III | B | | 30200 |
| Interior cedar/pine/oak barrens & glades | III | C | | 30300 |
| Appalachian balds | III | D | | 30400 |
| Early-successional hardwood and pine | III | E | | 30500 |
| Cliffs, domes, outcrops | III | F | | 30600 |
| Manmade/disturbed (e.g. hedgerows, ROWs, old fields) | III | G | | 30700 |
| Freshwater shrub/scrub | III | H | | 30800 |
| Coastal Communities | IV | | | 40000 |
| Maritime shrub/scrub | IV | A | | 40100 |
| Maritime forest | IV | B | | 40200 |
| Chenier/oak motte | IV | C | | 40300 |
| Estuarine emergent marsh | IV | D | | 40400 |
| Beaches and dunes | IV | E | | 40500 |
| Tidal mudflats | IV | F | | 40600 |
| Coastal prairie | IV | G | | 40700 |
| Mangroves | IV | H | | 40800 |
| Pine Communities | V | | | 50000 |
| Pine flatwoods | V | A | | 50100 |
| Pine savanna | V | B | | 50200 |
| Xeric pine scrub | V | C | | 50300 |
| Pine plantations | V | D | | 50400 |
| Other pine forest | V | E | | 50500 |
| Other pine forest - natural | V | E | 1 | 50501 |
| Pine sandhills | V | F | | 50600 |
| Upland Hardwood/ Pine Hardwood Communities | VI | | | 60000 |
| Spruce-fir | VI | A | | 60100 |
| Northern hardwood | VI | B | | 60200 |
| Mixed mesophytic (cove hardwood) | VI | C | | 60300 |
| Hemlock/white pine/hardwood | VI | D | | 60400 |
| High elevation oak/oak-pine | VI | E | | 60500 |
| Mixed hardwoods | VI | F | | 60600 |
| Dry mixed hardwoods | VI | F | 1 | 60601 |
| Mesic mixed hardwoods | VI | F | 2 | 60602 |
| Pine hardwoods | VI | G | | 60700 |
| Oak-cedar | VI | H | | 60800 |
| Oak savanna | VI | I | | 60900 |
| Hardwood plantation | VI | J | | 61000 |
| Tropical hardwoods | VI | K | | 61100 |
| Pelagic | VII | | | 70000 |
| Continental shelf | VII | A | | 70100 |
| Deep open water | VII | B | | 70200 |
| Gulf stream | VII | C | | 70300 |
| Cities/Towns/Suburbs | VIII | | | 80000 |
| Residential | VIII | A | | 80100 |
| Commercial Urban | VIII | B | | 80200 |
| Airfields/golf courses/cemeteries | VIII | C | | 80300 |
| Additional Classes | IX | | | 90000 |
| Quarry/ strip mines/ gravel pits | IX | A | | 90100 |

 
