Effects of sample size on the performance of species distribution models

A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms...


Bibliographic Details
Main Authors:	Wisz, M.S., Hijmans, R.J., Li, J., Peterson, A.T., Graham, C.H., Guisan, A.
Format:	Journal Article
Language:	English
Published:	Wiley 2008
Online Access:	https://hdl.handle.net/10568/166308

author	Wisz, M.S.; Hijmans, R.J.; Li, J.; Peterson, A.T.; Graham, C.H.; Guisan, A.
collection	Repository of Agricultural Research Outputs (CGSpace)
description	A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence–absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS‐INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM‐GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size (n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.
format	Journal Article
id	CGSpace166308
institution	CGIAR Consortium
language	English
publishDate	2008
publisher	Wiley
record_format	dspace
spelling	CGSpace166308 2025-05-14T10:39:28Z 2008-09 2024-12-19T12:56:07Z Journal Article https://hdl.handle.net/10568/166308 en Wiley Wisz, M. S.; Hijmans, R. J.; Li, J.; Peterson, A. T.; Graham, C. H.; Guisan, A. 2008. Effects of sample size on the performance of species distribution models. Diversity and Distributions, Volume 14 no. 5 p. 763-773
title	Effects of sample size on the performance of species distribution models
url	https://hdl.handle.net/10568/166308

