Modeling provides information on changes in the state of nature and supports conservation planning

Various models describing the state of nature and diversity produce information for planning the protection and management of species and habitats and for supporting decision-making. The models describe the functioning of habitats and supplement individual observations. Models also help to show how species populations, habitat types and the state of nature vary in time and space.

Models can be used to monitor and predict the effects of land use, climate change and other human-caused environmental changes on diversity and the functioning of ecosystems. Information about the functionality and reliability of the models is obtained by comparing the results given by the models with the observations. The uncertainty of the models can be reduced by collecting more data in the field and using that to develop a better model.

What are the models used for?

The models can be used to produce both basic and applied information about the functioning, structure and internal cause-and-effect chains of ecosystems and species communities, as well as about the environmental factors that control the development, spread and survival of individuals and populations of species. Human activity often has a significant impact on biodiversity. This effect can be visualized with the help of models, and the predictions of the models can be used as an aid in social planning and decision-making.

Regional assessments of the effects of land use on nature, and the allocation of protection measures to areas that are significant in terms of species and habitat types or sensitive to adverse effects, are based on the available information. Since not all areas can be mapped and not all characteristics of habitats can be measured, models are used as support. Models help to fill the gaps in nature and ecosystem knowledge, to identify connections between different factors and to assess the direction and intensity of future changes. Models are based on existing knowledge, understanding and measurements of the subject under consideration.

Models can be used to study, for example, the connections between ecosystems, species communities, climate and related processes. They can be used to predict future changes in various natural features, for example how the changing environment and land use decisions made now will affect biodiversity. Modeling provides one additional means of monitoring ecosystems and species populations.

Treetops pictured from below
Machine learning models can produce detailed predictions, for example of the occurrence of individual trees of large key species. Photo: Riku Lumiaro / Syke

A wide range of modeling methods

A wide range of approaches and techniques is used in modeling species, habitats and other aspects of biodiversity. Modeling methods can be classified in different ways; one option is to divide them into statistical and process-based models. The breadth of the field is also shown by the fact that one recent comparative study (Nordberg et al. 2019) included more than 30 methods for modeling species occurrence areas, based on statistical or machine learning approaches.

Statistical models and machine learning models

There are dozens of uses of statistical models in ecology and conservation biology. The models help, for example, to understand how the occurrence of valuable natural features depends on environmental factors and how different land use methods and other human measures that shape nature affect them. Building this type of model requires information on where the natural feature in question occurs and where it does not, as well as information about environmental variables measured (or otherwise assessable) at these places.

Statistical models are ‘static’ mathematical equations that describe correlations between variables. They describe how the occurrence or variation of a phenomenon or variable is linked to the variation of other biotic or abiotic factors.

In the models, the environmental variables act as factors explaining the variation in the response variable, such as the presence or abundance of rare species or other natural features. A wide variety of variables can be used as explanatory environmental factors, such as variables describing the macroclimate or the local climate, the structure, coverage and quality characteristics of the vegetation at the place of occurrence, the moisture and nutrient properties of the soil, or the land use of the surrounding landscape (e.g. Ilmonen et al. 2009; Pecchi et al. 2019; Saarimaa et al. 2019; Virkkala et al. 2022).
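As a rough illustration of this kind of model, the sketch below fits a logistic regression in which the response variable is the presence (1) or absence (0) of a species and the explanatory variables are two environmental factors. The variable names (standardized soil moisture and canopy cover), the synthetic data and the plain gradient-descent fitting are all assumptions made for this example; they do not come from the studies cited above.

```python
import math
import random


def predict_proba(intercept, weights, row):
    """Predicted probability of occurrence for one site."""
    z = intercept + sum(w * v for w, v in zip(weights, row))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent.

    X is a list of rows of environmental variables, y a list of 0/1 records.
    Returns (intercept, weights).
    """
    n, n_features = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * n_features
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * n_features
        for row, target in zip(X, y):
            err = predict_proba(intercept, weights, row) - target
            grad_b += err
            for j, value in enumerate(row):
                grad_w[j] += err * value
        intercept -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return intercept, weights


def make_synthetic_sites(n=200, seed=1):
    """Hypothetical data: occurrence depends on moisture, not on canopy cover."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        moisture = rng.uniform(-2, 2)  # standardized soil moisture (assumed)
        canopy = rng.uniform(-2, 2)    # standardized canopy cover (assumed)
        p = 1.0 / (1.0 + math.exp(-1.5 * moisture))
        X.append([moisture, canopy])
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_synthetic_sites()
intercept, weights = fit_logistic(X, y)
wet = predict_proba(intercept, weights, [1.5, 0.0])
dry = predict_proba(intercept, weights, [-1.5, 0.0])
print(f"moisture weight={weights[0]:.2f}, canopy weight={weights[1]:.2f}")
print(f"P(occurrence) wet site={wet:.2f}, dry site={dry:.2f}")
```

The fitted weights only summarize how occurrence co-varies with each variable in the data; they say nothing by themselves about why the species occurs where it does.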

It should be noted that statistical models based on regression analyses describe correlations between the response variable and the explanatory factors, and they do not necessarily describe cause-and-effect relationships (Jarnevich et al. 2015). This caution also applies to models other than regression models. Models should be evaluated carefully, especially when studying poorly known species and when the available environmental variables have been measured only approximately or are only indirectly connected to the ecological variables central to the species (so-called ‘distal’ or ‘surrogate predictors’; Hof et al. 2012; Mod et al. 2016; Gardner et al. 2019).

Often, previous research or a general understanding of the ecological foundations of species and habitat types provides a sufficient basis for statistical modeling of natural features. With models based on ecologically important environmental variables, it is thus possible to predict places favorable for the occurrence of conservationally significant natural features in poorly known areas, as well as how changes in climate and land use affect their occurrence.

One significant use of statistical models and the data they produce in Finland has been to predict the locations of sites valuable in terms of forest and wetland conservation (e.g. Parviainen et al. 2008; Saarimaa et al. 2019; Björklund et al. 2020; Forsius et al. 2021; Virkkala et al. 2022; Kujala et al. 2023). Statistical diversity models can also be used to target field surveys to the most promising new locations (Roden et al. 2017; Rosner-Katz et al. 2020).

Models used in ecological research

Dynamic species population models and process-based ecosystem models

Modelling uncertainty

Since generalized linear models (GLM), generalized additive models (GAM) and machine learning methods can all produce quite complicated, possibly overparameterized biodiversity models, their use requires caution. First, if the data available for modeling contain parts or variables of uncertain quality, the resulting predictions can be uncertain as well. Second, overparameterized models or models with uncertain response ratios often work accurately within the ‘ecological state’ of the model’s calibration area, but at the edges and especially outside of it, the reliability of the predictions produced by the model may decrease (Heikkinen et al. 2012; Yates et al. 2018).
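One simple way to notice when a prediction falls outside the conditions covered by the calibration data is to compare each environmental variable of a new site with the range seen during calibration. The sketch below does this one variable at a time; the function names and the example variables (mean temperature and annual rainfall) are hypothetical, and a per-variable range check is only a coarse screen that ignores combinations of variables.

```python
def calibration_ranges(rows):
    """Minimum and maximum of each environmental variable in the calibration data."""
    return [(min(column), max(column)) for column in zip(*rows)]


def outside_calibration(site, ranges):
    """Indices of the variables whose value lies outside the calibrated range."""
    return [
        i
        for i, (value, (low, high)) in enumerate(zip(site, ranges))
        if value < low or value > high
    ]


# Hypothetical calibration sites: (mean temperature in deg C, annual rainfall in mm)
calibration_sites = [(2.0, 600.0), (4.5, 750.0), (3.1, 680.0)]
ranges = calibration_ranges(calibration_sites)

inside_site = (3.0, 700.0)
warm_site = (6.0, 700.0)  # warmer than any calibration site

print(outside_calibration(inside_site, ranges))  # no variable is extrapolated
print(outside_calibration(warm_site, ranges))    # temperature (index 0) is extrapolated
```

Predictions for sites flagged in this way are extrapolations and deserve extra caution.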

When predicting favorable locations for species in unmapped areas (Virkkala et al. 2022) and in various land use situations (Luoto et al. 2007; Seedre et al. 2018) and in a changing climate (Virkkala et al. 2008, 2014; Eskildsen et al. 2013), there are other potential uncertainties in the use of models. These include, among others

In statistical models of species occurrence areas it is important to consider whether the species data includes records of both the presence of the species and the fact that it has not been observed, or whether it consists only of ‘positive’ observations (so-called ‘presence/absence models’ vs. ‘presence-only models’; Elith et al. 2020). Observation data collected by natural history museums and data based on citizen science often contain only positive species observations. A method often used to model this type of data is Maxent, which was developed specifically to take into account the special features of ‘presence-only’ data (Merow et al. 2013). With other modeling methods, the lack of absence data can be addressed by various means, such as adding carefully chosen artificial species-not-observed records (‘pseudoabsences’; Barbet-Massin et al. 2012) to the modeling data.
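The idea of pseudoabsences can be sketched as follows: from all grid cells of the study area, draw a random sample of cells in which the species has not been recorded and treat them as ‘not observed’ records. The grid, the cell coordinates and the function name are assumptions for this example, and purely random selection is only one possible strategy; how many pseudoabsences to draw and from where are modeling choices that need careful consideration.

```python
import random


def sample_pseudoabsences(all_cells, presence_cells, n, seed=0):
    """Randomly pick n grid cells that have no recorded presence of the species."""
    rng = random.Random(seed)
    candidates = sorted(set(all_cells) - set(presence_cells))
    if n > len(candidates):
        raise ValueError("not enough cells without presence records")
    return rng.sample(candidates, n)


# Hypothetical 5 x 5 study grid with three presence records
all_cells = [(x, y) for x in range(5) for y in range(5)]
presence_cells = [(0, 0), (2, 3), (4, 4)]

pseudoabsences = sample_pseudoabsences(all_cells, presence_cells, n=10)

# Combined modeling data: (cell, 1) for presences, (cell, 0) for pseudoabsences
records = [(cell, 1) for cell in presence_cells] + [(cell, 0) for cell in pseudoabsences]
print(len(records))
```

A pseudoabsence is not a confirmed absence: the species may occur in the cell without having been recorded there.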

Modeling occurrences of rare species often presents special challenges. Modeling the species together as a community can help considerably. Such models are known as joint species distribution models; they treat the different species as interdependent response variables.

Field scientists taking soil samples
Field observations are needed to develop the models and to minimize their uncertainties. Photo: Riku Lumiaro / Syke

A model is a simplified description of reality

Models are simplified descriptions of the real world. By comparing the results of the model with the observations, we get information about the reliability of the model. The uncertainty of models can be reduced with the help of measurement data and model development.
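A minimal sketch of such a comparison for a presence/absence model is shown below. It counts how often the predictions agree with field observations and reports sensitivity (the share of observed presences predicted correctly), specificity (the share of observed absences predicted correctly) and the true skill statistic, which is sensitivity plus specificity minus one. The observation and prediction lists are invented for the example.

```python
def evaluate_predictions(observed, predicted):
    """Compare 0/1 predictions with 0/1 observations from the same sites."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presence predicted, presence observed
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # presence missed by the model
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absence predicted, absence observed
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # presence predicted, none observed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("observations must include both presences and absences")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": sensitivity + specificity - 1.0,
    }


# Hypothetical field observations and model predictions for eight sites
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

scores = evaluate_predictions(observed, predicted)
print(scores)
```

Ideally the observations used for the comparison are not the same ones that were used to build the model, since otherwise the comparison tends to give too favorable a picture.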

Uncertainty is part of natural scientific measurements and modeling, and it must be taken into account when interpreting the results of models and in decision-making based on modeling. Uncertainty means in particular that the magnitude of the investigated variable is not known precisely but only as a range or probability. It can also mean that the processes and variables related to the investigated phenomenon have not been identified well enough.

If the uncertainty of models is not recognized, wrong assumptions can be made about the consequences of decisions based on modeling. When the uncertainties of the models are carefully taken into account, a basis is created for identifying the measures that are most likely to lead to the goals set, for example, for conservation planning and nature management.

The uncertainty of the input data or the initial state of models that examine the functioning of ecosystems can be evaluated with the help of measurements. Since measurement data is usually only available from some places and at some moments, the measurement data must be generalized to improve the spatial and temporal coverage of the data. The measurements can also be used to calibrate models and determine their structural uncertainty.

However, uncertainty can be caused by things that are not known or cannot be measured, such as future development. In such cases, we can use scenario analyses, which allow us to outline and compare alternative development paths depending on different assumptions. By simulating the development of forests several times in a row with randomly varying initial data, we can also identify solutions that always lead to the same final result, regardless of the uncertainty of the model and its initial data. For example, we can identify areas whose nature values are most likely to be preserved in the future, regardless of modeling uncertainty.
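The repeated-simulation idea can be sketched with a toy example. Each area has an uncertain initial nature value, each run perturbs that value at random and applies a randomly drawn future change, and the areas that stay above a threshold in nearly all runs are reported. The area names, the value scale, the noise level, the range of future change and the threshold are all hypothetical; a real application would use a forest or ecosystem simulation model in place of the one-line update used here.

```python
import random


def robust_areas(initial_values, n_runs=1000, noise_sd=0.1,
                 threshold=0.5, min_share=0.95, seed=42):
    """Return areas whose value stays above the threshold in most simulation runs."""
    rng = random.Random(seed)
    kept = {name: 0 for name in initial_values}
    for _ in range(n_runs):
        for name, value in initial_values.items():
            start = value + rng.gauss(0.0, noise_sd)  # uncertain initial state
            change = rng.uniform(-0.3, 0.0)           # assumed range of future decline
            if start + change >= threshold:
                kept[name] += 1
    shares = {name: count / n_runs for name, count in kept.items()}
    return {name: share for name, share in shares.items() if share >= min_share}


# Hypothetical nature-value scores for three areas
initial_values = {"area_A": 1.2, "area_B": 0.7, "area_C": 0.4}

result = robust_areas(initial_values)
print(sorted(result))
```

In this toy setting only the area with a large margin above the threshold is retained in nearly every run; the others depend too strongly on the random draws to count as robust.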

The inclusion of uncertainties in the research results indicates the quality and usability of the research as a support for decision-making. In addition, it helps to understand the limitations of the research. It is not always possible to reduce the uncertainty associated with model estimates. Then it is important to consider how big and what kind of risk we are ready to take. In this case, applying the precautionary principle and choosing the least risky option may make the most sense.