ICES Journal of Marine Science: Journal du Conseil Advance Access originally published online on February 28, 2008
ICES Journal of Marine Science: Journal du Conseil 2008 65(5):742-745; doi:10.1093/icesjms/fsn021

© 2008 International Council for the Exploration of the Sea. Published by Oxford Journals. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Predictions for an invaded world: a strategy to predict the distribution of native and non-indigenous species at multiple scales

Deborah A. Reusser¹ and Henry Lee, II²

¹ US Geological Survey, Western Fisheries Research Center, Oregon State University, 2111 NE Marine Science Drive, Newport, OR 97365, USA
² US EPA, ORD, NHEERL, Western Ecology Division, 2111 NE Marine Science Drive, Newport, OR 97365, USA

Correspondence to D. A. Reusser: tel: +1 541 8674045; fax: +1 541 8674049; e-mail: dreusser{at}usgs.gov

Reusser, D. A., and Lee II, H. 2008. Predictions for an invaded world: a strategy to predict the distribution of native and non-indigenous species at multiple scales. – ICES Journal of Marine Science, 65: 742–745.

Habitat models can be used to predict the distributions of marine and estuarine non-indigenous species (NIS) over several spatial scales. At an estuary scale, our goal is to predict the estuaries most likely to be invaded, but at a habitat scale, the goal is to predict the specific locations within an estuary that are most vulnerable to invasion. As an initial step in evaluating several habitat models, model performance for a suite of benthic species with reasonably well-known distributions on the Pacific coast of the US needs to be compared. We discuss the utility of non-parametric multiplicative regression (NPMR) for predicting habitat- and estuary-scale distributions of native and NIS. NPMR incorporates interactions among variables, allows qualitative and categorical variables, and utilizes data on absence as well as presence. Preliminary results indicate that NPMR generally performs well at both spatial scales and that distributions of NIS are predicted as well as those of native species. For most species, latitude was the single best predictor, although similar model performance could be obtained at both spatial scales with combinations of other habitat variables. Errors of commission were more frequent at a habitat scale, with omission and commission errors approximately equal at an estuary scale.

Keywords: ecological niche modelling, geographic scale, habitat modelling, non-indigenous species, non-parametric multiplicative regression, Northeast Pacific

Received 14 June 2007; accepted 21 January 2008; advance access publication 28 February 2008.


    Introduction
Many new habitat-modelling techniques are emerging in the environmental sciences for predicting distributions of plants and animals. These new techniques have been developed by conservation biologists to identify critical habitats for threatened and endangered species (Peterson, 2001), and in invasion biology to identify areas at risk of invasion (Peterson and Vieglais, 2001; Herborg et al., 2007). One limitation has been that the data available for these types of modelling exercises have been sparse for many species over large geographic areas. Recently, however, museums, universities, and government agencies have been collating and distributing large biological and environmental datasets on the Internet, making it possible to apply the new modelling techniques to many different species and environments. The difficulty comes in knowing how to apply the techniques with different datasets. As pointed out by McNyset and Blackburn (2006), it is crucial to understand the specific habitat model being used, including how errors and uncertainties associated with the model affect model performance (Barry and Elith, 2006).

Drawing on the lessons from Elith et al. (2006) and Barry and Elith (2006), we are currently developing a strategy to evaluate a suite of habitat models for predicting distributions of native and non-indigenous estuarine species. Estuarine environments present additional challenges compared with terrestrial and marine environments. For example, estuaries functionally represent habitat islands, and the larvae of many species are episodically dispersed great distances by currents. In addition, estuarine science does not have the luxury of continuous distributions for key environmental data layers, such as sediment composition and water temperature, that are available in terrestrial and, to a lesser extent, marine environments. Most Pacific coast estuaries are not detectable at a one-degree cell size, as used by Wiley et al. (2003) to predict the distributions of fish in the Atlantic Ocean and Caribbean Sea.

The strategy we are developing to evaluate the performance of habitat-modelling techniques for estuarine species is: (i) initially to model native and non-indigenous species (NIS) with reasonably well-known distributions; (ii) to include species with a variety of different spatial extents; (iii) to challenge the approaches by modelling large spatial extents, including areas outside the known range of the target species, when possible; (iv) to evaluate outcomes over different ecologically relevant spatial scales; and (v) to validate model outputs using independent data from the Northeast Pacific while attempting to minimize the effects of autocorrelation. After evaluating the various habitat models with known species, we will attempt to predict the distributions of new invaders.

Here, we present a preliminary evaluation of a relatively new modelling technique, non-parametric multiplicative regression (NPMR). To that end, we evaluate NPMR using marine/estuarine benthic species with reasonably well-documented distributions at two spatial scales, habitat and estuary. The reason for modelling a suite of species at this stage, rather than conducting a detailed analysis on one or two species, is to evaluate model behaviour across a range of taxonomic and functional groups, and to evaluate the utility of different habitat variables at two spatial scales.


    Methods
We modelled species distributions using NPMR as implemented in HyperNiche version 1.20 (McCune and Mefford, 2004). NPMR represents a species’ response surface in multidimensional environmental space by smoothing the response locally, combining information from neighbouring observations in that space (McCune, 2006). The reported advantages of NPMR are that it: (i) incorporates interactions among multiple ecological variables; (ii) can represent complex species response surfaces; and (iii) controls for overfitting (McCune, 2006). Three additional advantages for estuarine species modelling are that it: (iv) does not require continuous environmental data layers; (v) can incorporate categorical habitat variables; and (vi) incorporates absence as well as presence data. The Gaussian weighting function with a local mean estimator was used in all NPMR modelling. Several different approaches are available to evaluate model performance, and for this analysis, we used the area under the curve (AUC) of the receiver operating characteristic curve (Elith et al., 2006). An AUC value >0.5 indicates that the model is performing better than random in predicting a species' presence/absence. Elith et al. (2006) used a cut-off of AUC >0.75 for models that had "a useful amount of discrimination", and we add the criterion that an AUC >0.90 indicates that the model has high discrimination. Additionally, because NPMR uses leave-one-out cross-validation, it is possible to estimate omission and commission errors. These errors were calculated by treating any predicted probability of occurrence >0 as a presence.
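The Gaussian local mean estimator with leave-one-out cross-validation can be illustrated with a short sketch. This is not HyperNiche's implementation: the function name `npmr_local_mean` and the `tolerances` argument (one Gaussian standard deviation per predictor) are hypothetical, and only quantitative predictors are handled. Each observation is weighted by the product of per-predictor Gaussian kernels, which is what makes the regression multiplicative, and the estimate at a point is the weighted mean of the observed presence/absence values.

```python
import numpy as np

def npmr_local_mean(X, y, tolerances, leave_one_out=True):
    """Estimate probability of occurrence at each sample point.

    X: (n, m) array of quantitative predictors.
    y: (n,) array of 0/1 presence/absence.
    tolerances: (m,) Gaussian standard deviations, one per predictor
        (hypothetical inputs; a real fit would search over these).
    Returns (estimates, weight_sums), both of length n.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    s = np.asarray(tolerances, dtype=float)
    # Pairwise differences scaled by tolerance: shape (n, n, m).
    z = (X[:, None, :] - X[None, :, :]) / s
    # Multiplicative weighting: product of per-predictor Gaussian kernels.
    w = np.exp(-0.5 * z ** 2).prod(axis=2)
    if leave_one_out:
        # Exclude each point from its own estimate.
        np.fill_diagonal(w, 0.0)
    weight_sums = w.sum(axis=1)
    # Weighted mean of observed presence/absence; NaN if no weight.
    with np.errstate(invalid="ignore", divide="ignore"):
        estimates = (w @ y) / weight_sums
    return estimates, weight_sums
```

The returned weight sums give a rough indication of how much data supports each estimate; points with very small sums are effectively extrapolations.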

Habitat (point) scale
The objective of the habitat-scale modelling was to predict the presence/absence of a species at specific points as defined by individual benthic samples. Benthic samples for the modelling were obtained from US Environmental Protection Agency's (EPA) Western Coastal Environmental Monitoring and Assessment Program (EMAP; see Nelson et al., 2005). We used samples from estuarine surveys in California, Oregon, and Washington in 1999, 2000, 2002, and 2003, as well as samples from the 2002 survey of near-coastal and estuarine sites in south central Alaska and the shelf survey (30–120 m) of Washington in 2003. Therefore, habitat scale includes both estuarine and nearshore sites, whereas the estuary-scale analysis included only estuarine studies as defined below. Most samples were taken with a 0.1-m² grab and sieved through a 1.0-mm mesh screen. However, because different sample sizes or meshes were used in the San Francisco and 2002 intertidal surveys, the current analysis utilizes presence/absence data rather than abundance data. In all, there were 664 benthic samples and >2500 taxa across all stations.

The 23 most frequent species occurring at ≥50 stations were chosen for modelling. An additional four species with occurrence at between 38 and 49 sites were also modelled to include additional species whose ranges extended into Alaska or the Washington continental shelf. In all, 13 native species, 11 NIS, and 4 cryptogenic species were modelled at a habitat scale. The most frequently occurring species (>100 stations) included native species of amphipod (Americorophium salmonis) and polychaete (Glycinde polygnatha), NIS of amphipod (Grandidierella japonica and Monocorophium acherusicum), a bivalve (Mya arenaria), polychaetes (Polydora cornuta, Pseudopolydora kempi, and Streblospio benedicti), and a cryptogenic species of tanaid (Leptochelia dubia).

Quantitative habitat variables at each station included percentage silt and clay, total organic carbon (TOC) of the sediment, and sample depth. Because the values for overlying salinity were not available for the intertidal sites, we used two categorical salinity classifications. One was the Venice system, consisting of five classes: fresh water, oligohaline, mesohaline, polyhaline, and euhaline. The second system consisted of subdividing the oligohaline, mesohaline, and polyhaline into two classes each, for a total of eight classes. The salinity class for each site was determined from the overlying water sample when available; otherwise, the location of each sampling point was plotted in a GIS system, and the salinity class was estimated by the location's proximity to existing salinity records. Another suite of categorical variables was the presence/absence of four ecological engineering guilds: burrowing shrimp (Neotrypaea californiensis or Upogebia pugettensis), submerged aquatic vegetation (Zostera marina or Zostera japonica), marsh plants, and/or macroalgae. The presence of burrowing shrimp, rooted aquatic plants, or macroalgae can alter the structure of intertidal and shallow-water benthic assemblages through a variety of mechanisms, including the effects on dissolved oxygen, sedimentation, and bioturbation.
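As an illustration of the five-class salinity variable, the sketch below assigns a measured salinity to a Venice class. The breakpoints (0.5, 5, 18, and 30) are the conventional Venice system limits and are an assumption here, because the text does not state them; the function name `venice_class` is hypothetical. Values above 40, usually termed hyperhaline, are lumped with euhaline because only five classes are listed above. The eight-class subdivision is not sketched because its boundaries are not given.

```python
def venice_class(salinity):
    """Return the five-class Venice salinity category for a salinity value.

    Breakpoints are the conventional Venice system limits (assumed,
    not taken from the article).
    """
    if salinity < 0:
        raise ValueError("salinity must be non-negative")
    if salinity < 0.5:
        return "fresh water"
    if salinity < 5:
        return "oligohaline"
    if salinity < 18:
        return "mesohaline"
    if salinity < 30:
        return "polyhaline"
    return "euhaline"
```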

Estuary scale
The objective at an estuary scale was to predict the presence/absence of a species within a specific estuary. Data for modelling at an estuary scale were obtained from the Pacific Coast Ecosystem Information System (PCEIS). PCEIS is a regional database of native and non-indigenous invertebrates, plants, and fish found in estuaries on the Pacific coast of the US, with associated landscape and watershed characteristics for each estuary (Lee and Reusser, 2006). The information contained in PCEIS comes from a variety of sources, including historical sampling efforts, museum records, published literature, and ongoing monitoring efforts such as the US EPA's EMAP program. From a set of 180 estuaries with biological information in PCEIS, a subset of 28 was selected, where at least 100 species had been reported, to reduce errors of omission from false negative occurrences. These estuaries were well distributed along the Pacific coast from Grays Harbor, Washington, to the Tijuana River in California, and varied in size from 4 to 14 518 km².

Species occurrence data for the 28 estuaries were extracted from PCEIS for the 28 species used in the habitat-scale model evaluation. Based on model data predictor guidelines, this set of 28 species was reduced to a subset of 13 species (seven native, five non-indigenous, and one cryptogenic) that were recorded as present in at least 10 and absent from at least 10 of the 28 selected estuaries. A suite of 13 landscape-scale characteristics for each estuary was extracted from PCEIS across four broad categories: geography (biogeographic province and latitude), climate (mean annual air temperature and mean annual precipitation), watershed (land, water, intertidal, subtidal, and riverine area), and geomorphology (ratio of water to land, ratio of subtidal to intertidal, ratio of riverine to estuarine, and ratio of intertidal to estuarine area).
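The species screening rule, read here as requiring at least 10 estuaries with the species present and at least 10 with it absent, can be sketched as a simple filter. The function name `select_species` and the input layout (a mapping from species name to a list of 0/1 flags, one per estuary) are hypothetical, and the species labels in the usage note are placeholders rather than PCEIS records.

```python
def select_species(occurrence, min_present=10, min_absent=10):
    """Keep species with enough presences and absences to model.

    occurrence: dict mapping species name -> list of 0/1 flags,
        one flag per estuary (hypothetical layout).
    """
    selected = []
    for species, flags in occurrence.items():
        present = sum(1 for f in flags if f)
        absent = len(flags) - present
        if present >= min_present and absent >= min_absent:
            selected.append(species)
    return selected
```

For example, with 28 estuaries a species present in 12 and absent from 16 would be retained, whereas one present in 25 and absent from only 3 would be dropped.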

Geographic variables and datasets
Two geographic variables, latitude and biogeographic province, were evaluated in both the habitat and estuary analyses. The biogeographic provinces were based on the study by Croom et al. (1995) that divides the outer coasts of California, Oregon, and Washington into three provinces, and classifies Puget Sound as a fourth. For the habitat analysis, south central Alaska and the Washington shelf samples were considered separate biogeographic provinces, and a categorical variable was added to indicate whether the site was coastal or estuarine. In the habitat analysis, model runs including all sites (n = 664) are referred to as "All sites with geography" or "All sites without geography", depending on whether or not latitude and biogeographic province were included. We could not include quantitative measures of overlying water salinity or temperature in the "all site" scenarios because our overall data included intertidal sites, precluding the measurement of overlying water variables. To evaluate their importance, the subtidal sites (n = 454) were modelled using the site-specific quantitative values for temperature and salinity, as well as the categorical salinity classes mentioned above.


    Results and discussion
A prime objective of this preliminary analysis was to determine whether NPMR provided sufficient power in predicting the distributions of native and non-indigenous benthic species to warrant more detailed analyses. For the "All sites with geography" models, the average AUC for all 28 species at a habitat scale was 0.87. The average AUC for the 13 species used in the estuarine-scale analysis was 0.79. Based on the relatively high AUC values for both scales, we conclude that NPMR was sufficiently predictive to continue its evaluation with more detailed analyses.
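For readers unfamiliar with the AUC statistic reported here, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen presence receives a higher predicted value than a randomly chosen absence, with ties counted as one half. The function name `auc` is hypothetical and this is a generic illustration, not the routine used by HyperNiche.

```python
def auc(observed, predicted):
    """Area under the ROC curve from pairwise comparisons.

    observed: iterable of 0/1 presence/absence values.
    predicted: iterable of predicted probabilities, same length.
    """
    pos = [p for p, o in zip(predicted, observed) if o == 1]
    neg = [p for p, o in zip(predicted, observed) if o == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # presence ranked above absence
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of presences from absences.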

A related question is whether errors of omission (false negatives) or commission (false positives) were more prevalent. Are the models more likely to predict incorrectly that a species is absent, or are they more likely to predict incorrectly that a species is present? At the habitat scale, the commission error was higher than the omission error for 23 of the 28 species, based on the "All Sites with geography" models. One possible reason for the higher frequency of commission errors was that we were unable to include temperature and salinity in this model, because our overall data included intertidal sites. However, the inclusion of a quantitative measure of salinity and overlying water temperature in the subset of subtidal samples did not change this pattern, and commission errors were still larger in 21 of the 28 species. Another possible cause is the inherent small-scale variability of benthic organisms. In this case, the model would correctly predict that the habitat type is suitable, but the species could be absent in an individual sample because of processes operating locally, such as small-scale variation in recruitment or predation. In future habitat-scale analyses, we will attempt to address this issue by aggregating samples over a larger spatial area.
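The two error types can be tallied from cross-validated predictions as in the following sketch. The function name `error_counts` is hypothetical; the default threshold of 0 follows the rule stated in the Methods that any predicted probability of occurrence above 0 counts as a presence. Counts rather than rates are returned because the denominators used for any rates are not specified in the text.

```python
def error_counts(observed, predicted, threshold=0.0):
    """Count omission (false negative) and commission (false positive) errors.

    observed: iterable of 0/1 presence/absence values.
    predicted: iterable of predicted probabilities of occurrence.
    threshold: probabilities strictly above this count as presence.
    Returns (omission, commission).
    """
    omission = 0
    commission = 0
    for obs, prob in zip(observed, predicted):
        predicted_present = prob > threshold
        if obs and not predicted_present:
            omission += 1        # species found but predicted absent
        elif predicted_present and not obs:
            commission += 1      # species predicted but not found
    return omission, commission
```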

At an estuary scale, the frequencies of the two types of error were more similar: commission error was more prevalent than omission error for 7 of the 13 species, and one species had neither omission nor commission errors. These omission errors are real errors: the model predicted that a species would not occur, even though it had been found within the estuary. Conversely, the commission errors could reflect reporting biases, and future sampling of these estuaries may reveal that the species is indeed present.

Another question was whether the models would differ in their ability to predict the presence/absence of native species vs. NIS. One possibility was that model performance could be degraded if the NIS had not yet approached an "equilibrium" distribution. Using the models based on the "All sites with geography" scenario at a habitat level, we found no difference in predictive power between native species and NIS. The average AUC for the 13 native species was 0.87, and for the 11 NIS, it was 0.86. Similarly, at the estuary scale, the average AUC of both the native species and NIS was 0.79. These results indicate that it is possible to use NPMR to evaluate the relationship of well-established NIS with environmental factors at habitat and estuary scales. This does not imply that the distributions of these NIS will not expand in the future, or that the habitat models will work as well with rarer or recently introduced species.

Our last line of inquiry focused on evaluating the key environmental predictors. One specific question relates to the inclusion of geographic variables. This raises a conundrum. If the goal is to predict a species’ distribution within a prescribed geographic extent, then inclusion of geographic variables may be appropriate. However, if the goal is to predict the "equilibrium" range of a species, then inclusion of geographic variables may overestimate the range for a species undergoing contraction or, conversely, underestimate the range for a species undergoing expansion. For example, predicting the distribution of a recently introduced NIS based on models using geographic variables may substantially underestimate its final range.

To assess the influence of the geographic variables, we compared model performance with and without their inclusion. At an estuary scale, removing the geographic variables had no effect on the average AUC of the 13 species, though there was a decrease of one species in the >0.90 category and an increase of one species in the minimally predictive range (>0.5 and <0.75; Figure 1). It appears that the combination of mean air temperature and mean precipitation of the watershed captures much of the environmental information contained in latitude and/or biogeographic province. Future modelling efforts at an estuary scale will explore whether the inclusion of high-resolution coastal sea surface temperature improves the predictions at this scale.


Figure 1. Percentage of species falling into different classes of the AUC in the habitat- and estuary-scale modelling with and without the geographic variables (w/Geo and wo/Geo, respectively), i.e. latitude and biogeographic province. "Habitat w/Geo" are the habitat-scale models with n = 664 samples and the geographic variables included. "Habitat wo/Geo" are the results from the same dataset without the geographic variables. "Estuarine w/Geo" and "Estuarine wo/Geo" are the estuary-scale models (n = 28 estuaries) with and without the geographic variables, respectively. The habitat results were based on 28 species and the estuarine results on 13 species. Numbers above each bar indicate the average neighbourhood size used in the model.

 
At the habitat scale, latitude was the single best predictor for 21 species, and either latitude or biogeographic province was incorporated into the final model for all 28 species. This is not surprising, because sampling station locations ranged from Tijuana to Alaska. Nonetheless, removal of the geographic variables resulted in just a relatively small decline in overall model performance, with average AUC decreasing from 0.87 to 0.83. The more apparent effect of removing geographic variables was that the number of species models that displayed high discriminatory power (AUC >0.90) declined from ten to two, with a concomitant increase in the number of species with an AUC of 0.75–0.90 (Figure 1).
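The AUC classes used to summarize these comparisons can be expressed as a small binning routine. The class labels and function names below are hypothetical, and placing a value of exactly 0.75 or 0.90 in the lower class is an assumption, since the text gives the cut-offs only as >0.5, >0.75, and >0.90. The input values in the test are made up for illustration and are not the per-species results.

```python
from collections import Counter

def auc_class(auc):
    """Assign an AUC value to one of the discrimination classes."""
    if auc > 0.90:
        return "high"
    if auc > 0.75:
        return "useful"
    if auc > 0.5:
        return "minimal"
    return "no better than random"

def class_percentages(auc_values):
    """Percentage of species models falling in each AUC class."""
    counts = Counter(auc_class(a) for a in auc_values)
    total = len(auc_values)
    return {label: 100.0 * n / total for label, n in counts.items()}
```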

Quantitative values of overlying water salinity and temperature were considered as potentially key variables at a habitat scale. To evaluate their effects, we took the subset of 454 subtidal samples that included these measurements. The results of these subtidal samples mirrored those of the "All Sites with geography" models, with an average AUC of 0.87 when the geographic variables were included. Therefore, inclusion of these variables did not have a discernible effect on average model performance when the geographic variables were included, and latitude was still the single most predictive variable for 14 of the 28 species in the subtidal subset. Excluding the geographic variables reduced the average AUC of the subtidal subset slightly (0.85). When the geographic variables were excluded, water temperature was the single most important variable for 16 species, and the quantitative measure of salinity was the most important for seven species. Although never the most important single variable, the categorical measurements of salinity were included in the final models approximately as often as the quantitative salinity measurements. The results suggest that, although latitude remained the single best predictor, combinations of variables including water temperature and salinity can generate comparable models. The results also suggest that, although quantitative salinity measurements are preferred, the use of salinity classes is a reasonable substitute when quantitative measurements are not available.


    Conclusions
This preliminary analysis has suggested that NPMR can predict the distributions of many native and non-indigenous benthic species with a reasonable degree of accuracy at both habitat and estuary scales. However, more detailed analysis with NPMR is required along with comparisons with other modelling approaches to conclude whether it is the "best" approach to predicting the potential distributions of newly introduced NIS. The preliminary analysis also generated insights into the types of habitat variables that can be used in such predictions. Geographic variables were the strongest single predictors at both scales, although combinations of watershed- and estuarine-landscape characteristics (estuary scale) or site-specific quantitative or categorical habitat variables (habitat scale) could be used to generate models of similar predictive power.


    Acknowledgements
 
DAR was partially funded through AMI/GEOSS IAG # DW-14-92231501-0 from the US Environmental Protection Agency. The publication was subjected to review by the National Health and Environmental Effects Research Laboratory's Western Ecology Division and the USGS, and is approved for publication. However, approval does not signify that the contents reflect the views of the US EPA or the USGS. The use of trade, firm, or corporation names in this publication is for the information and convenience of the reader; such use does not constitute official endorsement or approval by the US Department of Interior, the US Geological Survey, or the US Environmental Protection Agency of any product or service to the exclusion of others that may be suitable.


    References

    Barry S., Elith J. Error and uncertainty in habitat models. Journal of Applied Ecology (2006) 43:413–423.

    Croom M., Wolotira R., Henwood W. Marine Region 15: Northeast Pacific. In: A Global Representative System of Marine Protected Areas, Vol. IV, The Great Barrier Reef Marine Park Authority—Kelleher G., Bleakley C., Wells S., eds. (1995) Washington DC: The World Bank and IUCN. 55–106.

    Elith J., Graham C. H., Anderson R. P., Dudik M., Ferrier S., Guisan A., Hijmans R. J., et al. Novel methods improve prediction of species’ distributions from occurrence data. Ecography (2006) 29:129–151.

    Herborg L., Jerde C. L., Lodge D. M., Ruiz G. M., MacIsaac H. Predicting invasion risk using measures of introduction effort and environmental niche models. Ecological Applications (2007) 17:663–674.

    Lee H., Reusser D. A. The Pacific Coast Ecosystem Information System (PCEIS). US EPA and US Geological Survey Access Database version 1.0 (2006).

    McCune B. Non-parametric habitat models with automatic interactions. Journal of Vegetation Science (2006) 17:819–830.

    McCune B., Mefford M. J. HyperNiche. In: Nonparametric Multiplicative Habitat Modeling. Version 1.19 (2004) Gleneden Beach, OR: MjM Software.

    McNyset K. M., Blackburn J. K. Does GARP really fail miserably? A response to Stockman et al. (2006). Diversity and Distributions (2006) 12:782–786.

    Nelson W. G., Lee H. II, Lamberson J. O., Engle V., Harwell L., Smith L. M. Condition of estuaries of the western United States for 1999: a Statistical Summary. EPA Report 620/R-04/200. US EPA, Office of Research and Development (2005).

    Peterson A. T. Predicting species’ geographic distributions based on ecological niche modeling. The Condor (2001) 103:599–605.

    Peterson A. T., Vieglais D. A. Predicting species invasion using ecological niche modeling: new approaches from bioinformatics attack a pressing problem. BioScience (2001) 51:363–371.

    Wiley E. O., McNyset K. M., Peterson A. T., Robins C. R., Stewart A. M. Niche modeling and geographic range predictions in the marine environment using a machine-learning algorithm. Oceanography (2003) 16:120–127.

