Validating Models of Complex Socio-Ecological Systems in the Mediterranean Using ‘Digital Proxies’

Authors: C. Michael Barton, Isaac IT Ullah, Nicolas Gauthier, Nari Miller, Grant Snitker, Irene Esteban, Joan Bernabeu, and Arjun Heimsath

Validation is key to assessing the usefulness of a model. Ironically, the richness and sophistication of spatially explicit agent-based or cellular models of past socio-ecological systems make them particularly difficult to validate. Such models commonly produce results in the form of high-resolution, multidimensional digital landscapes that can include locations of simulated land-use, human settlement, vegetation communities, soils, topography, and more. We cannot directly observe past human and natural system processes to validate these models, however, and must instead rely on proxy data to infer the processes represented. Proxies for past socio-ecological systems include samples of sediments and soils, plant micro- and macro-fossils, discarded artifacts, and cosmogenic radionuclides. Such proxy records tend to be incomplete and sparsely preserved, and they are collected from a limited number of points on the landscape, often at multiple depths below the surface. Even when converted to digital form, empirical proxies differ so greatly from model results that direct comparison is impossible, which creates important challenges for model validation.

To address the incommensurability between our models and the empirical data available to validate them, the Mediterranean Landscape Dynamics Project (MedLanD) has developed a validation instrument that creates a ‘digital proxy’ record based on model results. The digital proxy is analogous to extracting a digital core at specified points in the gridded, digital landscape. It simulates the accumulation over time of a proxy-like record for modeled human land-use, vegetation, landscape fire, and surface processes. Digital proxy ‘cores’ can be extracted from any point in the simulated world and compared directly with empirical samples taken from analogous points in real world landscapes, improving our ability to validate complex models. We present a brief overview of our digital proxy modeling method and provide a test case of comparing digital and empirical data from locales in Mediterranean Spain.
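The idea of pulling a digital core from a gridded landscape can be illustrated with a minimal sketch. This is not the MedLanD implementation: the layer names (`elevation_change`, `fire`, `land_use`), the `extract_digital_core` function, and the rule that deposition adds a stratum while erosion strips material from the top are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical model output: one dict of gridded layers per simulated time step.
# Layer names and values are illustrative assumptions, not MedLanD results.
Grid = List[List[float]]
TimeStep = Dict[str, Grid]


def extract_digital_core(steps: List[TimeStep], row: int, col: int) -> List[Dict[str, float]]:
    """Accumulate a core at one grid cell; index 0 is the oldest (deepest) stratum."""
    core: List[Dict[str, float]] = []
    for year, layers in enumerate(steps):
        change = layers["elevation_change"][row][col]  # metres; + deposition, - erosion
        if change > 0:
            # Deposition: record what the model says was happening at this cell.
            core.append({
                "year": year,
                "thickness": change,
                "charcoal": layers["fire"][row][col],
                "farmed": layers["land_use"][row][col],
            })
        elif change < 0:
            # Erosion: strip material from the top of the core downward.
            to_remove = -change
            while to_remove > 0 and core:
                top = core[-1]
                if top["thickness"] > to_remove:
                    top["thickness"] -= to_remove
                    to_remove = 0.0
                else:
                    to_remove -= top["thickness"]
                    core.pop()
    return core


def make_step(change: float, fire: float, land_use: float) -> TimeStep:
    """Build a uniform 2x2 toy landscape for one time step."""
    def fill(value: float) -> Grid:
        return [[value] * 2 for _ in range(2)]
    return {"elevation_change": fill(change), "fire": fill(fire), "land_use": fill(land_use)}


steps = [
    make_step(0.5, 0.0, 1.0),    # year 0: deposition while the cell is farmed
    make_step(0.25, 1.0, 0.0),   # year 1: deposition after a fire
    make_step(-0.5, 0.0, 0.0),   # year 2: erosion removes the fire layer and more
    make_step(0.125, 0.0, 1.0),  # year 3: deposition resumes
]
core = extract_digital_core(steps, row=0, col=1)
for stratum in core:
    print(stratum)
```

In this toy run the erosion event removes the fire layer entirely, so the resulting core has a gap in its record, which is the kind of incompleteness that empirical cores also show.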

An earlier version of this paper was presented at the European Association of Archaeologists meetings in Barcelona, Spain (7 September 2018).

The MedLanD model is available at

Impressive model! The idea of generating data that mimics the way actual data are collected introduces another problem for model validation, however: how materials end up in the samples is itself uncertain and may depend on all kinds of assumptions.
It reminds me of the work @miguelpais is doing on testing different protocols for collecting fish-count data. Since the data are not independent of various assumptions, you may derive a more robust (or more conservative?) understanding of the model dynamics given uncertainties in both the model and the data collection protocol. I would also like to poke @cwren and @soestmo, since their work also involves comparing model results with archaeological data.

In this case, we are trying to simulate the processes by which materials are deposited and end up in the samples. An important part of what we are interested in is, in fact, how and why materials end up where they do. While there may be some level of equifinality, we are hopeful that if we can produce digital cores that are similar to real ones, we may be coming close to simulating the processes that move and deposit things on landscapes. At the least, this is closer than comparing our models to a narrative interpretation of proxy data based on reasonable but untested assumptions about depositional processes.
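One simple way to put a number on "similar" is to average each core's proxy values within common depth bins and compare the bins that both cores cover. The sketch below is only an illustration under assumed inputs: the `(depth, value)` sample format, the function names, and the choice of mean absolute difference as the measure are assumptions, not the method used in MedLanD.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (depth below surface in metres, proxy value)


def bin_by_depth(samples: List[Sample], bin_size: float, n_bins: int) -> List[Optional[float]]:
    """Mean proxy value per depth bin; None where a core has no sample in that bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for depth, value in samples:
        i = int(depth // bin_size)
        if 0 <= i < n_bins:
            sums[i] += value
            counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def mean_abs_difference(digital: List[Sample], empirical: List[Sample],
                        bin_size: float, n_bins: int) -> float:
    """Average absolute difference over the depth bins sampled in both cores."""
    d = bin_by_depth(digital, bin_size, n_bins)
    e = bin_by_depth(empirical, bin_size, n_bins)
    pairs = [(a, b) for a, b in zip(d, e) if a is not None and b is not None]
    if not pairs:
        raise ValueError("the two cores share no sampled depth bins")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Made-up charcoal values, for illustration only.
digital_core = [(0.1, 2.0), (0.3, 4.0), (0.6, 6.0), (1.2, 1.0)]
empirical_core = [(0.2, 3.0), (0.7, 5.0), (1.7, 9.0)]
score = mean_abs_difference(digital_core, empirical_core, bin_size=0.5, n_bins=4)
print(score)
```

Because only shared bins are compared, sparse empirical sampling reduces the evidence available without penalizing the model for depths that were never sampled. A low score does not rule out equifinality, since different depositional histories can produce similar cores.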

The “digital proxy” idea is interesting. It reminds me of a never-published model of macrozoobenthos in a tidal flat ecosystem, where the only historical information we had was likewise cores with different layers of clams, mussels, etc.

Overall, “digital proxy” does what the approaches “pattern-oriented modelling” and “virtual ecologist” recommend: if you have data that contain valuable information about the past, i.e. a pattern, make sure to have variables and processes in the model so that then, as a “virtual ecologist/researcher”, you can sample exactly the same kind of data in the model.

I take the fact that you independently developed very similar ideas as a strong argument that these approaches make sense and are in fact needed. Thank you!

Pattern-oriented modelling: Grimm, V., & Railsback, S. F. (2012). Pattern-oriented modelling: a ‘multi-scope’ for predictive systems ecology. Phil. Trans. R. Soc. B 367: 298-310.

Virtual ecologist: Zurell D, Berger U, Cabral JS, Jeltsch F, Meynard CN, Münkemüller T, Nehrbass N, Pagel J, Reineking B, Schröder B, Grimm V (2010) The virtual ecologist approach: simulating data and observers. Oikos 119: 622-635.

Thanks, Volker. We indeed consider this a particular form of the more general pattern-oriented modeling, which we also consider an example of the “model as experiment” approach of Bankes et al. (2002). We often cite both in our published work.

Bankes, S.C., Lempert, R., Popper, S., 2002. Making Computational Social Science Effective: Epistemology, Methodology, and Technology. Social Science Computer Review 20, 377–388.