Cubification of Biodiversity Data: FAIRiCUBE and the European Habitat Classification System

Susanna Ioni, Kryštof Chytrý, Kathi Schleidt, Stefan Jetschny, Heimo Rainer

Published 2025 in Biodiversity Information Science and Standards

ABSTRACT

European habitats are classified under a framework developed by the European Topic Centre for Biodiversity for the European Environment Agency, as part of the European Nature Information System (EUNIS) (Davies et al. 2004). All terrestrial, freshwater, and marine habitats follow a hierarchical classification based on physical features, human influence, and dominant vegetation (Moss 2008, Chytrý et al. 2020). Distribution maps are provided, modelled from occurrence data of indicator species collected in vegetation surveys (Hennekens 2017). Although the system may seem accurate, when we first plotted the distribution of the main species of our case-study habitat, EUNIS Habitat S22 ‘Alpine and subalpine ericoid heath’ (European Environment Agency 2019), we observed that occurrence data from sources such as the Global Biodiversity Information Facility (GBIF) often fell outside the mapped areas of the habitat. Furthermore, important occurrence data sources, such as herbaria, were left out of the official distribution mapping, which in our view is a significant shortcoming of the EUNIS system. This study addresses these gaps by integrating diverse sources of in situ occurrence data (herbaria, vegetation surveys, citizen science) through a machine learning approach to complement the current EUNIS mapping. Specifically, we modelled the distributions of the diagnostic species of Habitat S22 using species distribution models (SDMs). For this purpose, we retrieved occurrence data from GBIF, searching by accepted names as well as taxonomic synonyms, using the R package rgbif (Chamberlain et al. 2025) and the Darwin Core standard (Wieczorek et al. 2012). Data were filtered to retain European occurrences with spatial coordinates and a coordinate uncertainty of <500 m, recorded in the spring and summer months of 1980–2024. For modelling, the records were stratified on a 1-km grid. As SDM predictors, we used proxies for macroclimate and topography.
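The occurrence filtering described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study. The field names `decimalLatitude`, `decimalLongitude`, `coordinateUncertaintyInMeters`, `year`, `month`, and `species` are Darwin Core terms; everything else is an assumption made for the example: the bounding box for Europe, the choice of March to August as spring and summer, the decision to drop records without a stated uncertainty, the hypothetical projected coordinates `x_m` and `y_m` (metres), and the reading of "stratified" as keeping one record per species per 1-km cell.

```python
import math

# All constants below are illustrative assumptions, not values from the study.
MAX_UNCERTAINTY_M = 500                  # keep records with uncertainty < 500 m
FIRST_YEAR, LAST_YEAR = 1980, 2024
SPRING_SUMMER = {3, 4, 5, 6, 7, 8}       # assumed: March to August
EUROPE_BBOX = (-25.0, 34.0, 45.0, 72.0)  # assumed: min_lon, min_lat, max_lon, max_lat


def keep_record(rec):
    """Return True if a Darwin Core-style record passes all filters."""
    lat = rec.get("decimalLatitude")
    lon = rec.get("decimalLongitude")
    unc = rec.get("coordinateUncertaintyInMeters")
    year = rec.get("year")
    month = rec.get("month")
    # Records missing any required field are dropped (an assumption).
    if None in (lat, lon, unc, year, month):
        return False
    min_lon, min_lat, max_lon, max_lat = EUROPE_BBOX
    if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
        return False
    if unc >= MAX_UNCERTAINTY_M:
        return False
    return FIRST_YEAR <= year <= LAST_YEAR and month in SPRING_SUMMER


def thin_to_grid(records, cell_m=1000):
    """Keep the first record per species and grid cell.

    Expects hypothetical projected coordinates 'x_m' and 'y_m' in metres.
    """
    first_in_cell = {}
    for rec in records:
        key = (
            rec["species"],
            math.floor(rec["x_m"] / cell_m),
            math.floor(rec["y_m"] / cell_m),
        )
        first_in_cell.setdefault(key, rec)
    return list(first_in_cell.values())
```

Filtering before thinning means a low-quality record cannot occupy a cell that a better record would otherwise fill.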
Climatic predictors included CHELSA Bioclim variables of mean annual temperature, temperature seasonality, annual precipitation, precipitation seasonality, and an aridity index (Zomer et al. 2022). For topography, we used the Copernicus digital terrain model and calculated slope and indices for heat load (McCune and Keon 2002), topographical ruggedness (Riley et al. 1999), and topographical wetness (Beven and Kirkby 1979), using the spatialEco R package (Evans and Murphy 2021) and SAGA GIS (Conrad et al. 2015). Data were integrated into data cubes, and correlations among species occurrences and predictors were tested. We supplemented the occurrence data with pseudo-absences sampled within a buffer around presence points (Fallgatter et al. 2025). We fitted ensemble SDMs weighted by true skill statistic (TSS) scores based on independent cross-validation. We modelled two spatial resolutions in two regions: continental Europe at 1-km resolution, and the European Alps at 100-m resolution. Predicted species distributions were aggregated into cumulative distribution maps. These were further validated by overlaying them on the distribution of the habitat derived from European Vegetation Archive (EVA) plots classified by an expert system, at 1-km resolution. Predictions were also compared with the official EUNIS probability map for Habitat S22. Correlation analyses confirmed the ecological features of Habitat S22 indicated by the EUNIS classification. Our modelled ranges largely overlapped with the distribution of EVA plots and the EUNIS probability map, but also revealed mismatches at lower elevations and in the Scandinavian region. These differences decreased when fewer species were combined in cumulative predictions.
Our findings show that SDMs based on occurrence data from different sources can validate and refine expert-defined habitat maps, offering a complementary and data-driven approach.
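The TSS-weighted ensemble mentioned in the abstract can be illustrated with a short sketch. The true skill statistic is sensitivity plus specificity minus one. The weighting scheme shown here (a weighted mean of per-model predictions, with negative TSS scores clipped to zero) is one common choice and an assumption of this example, not necessarily the scheme used by the authors; the function names are hypothetical.

```python
def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for binary presence/absence data."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1
        elif obs and not pred:
            fn += 1
        elif not obs and pred:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def tss_weighted_ensemble(predictions, tss_scores):
    """Weighted mean of per-model predictions, one value per grid cell.

    predictions: list of models, each a list of suitability values per cell.
    tss_scores: one cross-validated TSS per model.
    Negative scores are clipped to zero (an assumption of this sketch).
    """
    weights = [max(score, 0.0) for score in tss_scores]
    total = sum(weights)
    if total == 0:
        raise ValueError("no model has a positive TSS")
    n_cells = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n_cells)
    ]
```

Clipping means a model that performs no better than chance contributes nothing to the ensemble, while better-validated models dominate the combined prediction.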
