The Repast team provides their perspectives on where agent-based modeling is right now, what it’s enabling them to do, and where they think the field is headed.
Topics discussed include collaborative model building, encapsulation of complex agent processes, incorporating machine learning into ABM, geographically contextualized ABMs, applications of ABM in crises, and how high-performance computing is transforming ABM. They also discuss how, with greater computational power and capabilities, ABM practitioners bear greater responsibility to develop trusted models that can be applied effectively across domains, and how the Extreme-scale Model Exploration with Swift (EMEWS) framework enables this.
Great talk! Given the facilities you have, it is no surprise that you focus on high-performance computing. Do you know how the results and insights change when going from a modest number of agents to, for example, a 3 million agent model of Chicago? In general, how much do we know about the additional insights we gain from having more detail in the model? Another use of more computational power, if we could demonstrate what the implications of different levels of detail are, would be to keep the models relatively simple and do more sophisticated sensitivity and uncertainty analysis. A reason for more analysis is that many of the model inputs (data and mechanisms) are often surrounded by a lot of uncertainty.
Thanks for watching Marco! And thank you for organizing this unique virtual conference!
Regarding your two points, the first about the need for larger models and the second about what we’ve been referring to as model exploration, let me respond to them in order.
First, I’m convinced that there are many systems that can be fruitfully modeled with simple ABMs, so I want to make sure that people don’t come away from our presentation thinking that we are advocating for large/complex models for all purposes. One reason why we like to have the ability to model many agents is because we often have access to very detailed data about the system under study. The data can be about the agent environment, e.g., the types of places that our agents visit, agent behaviors, or agent attributes. Being able to incorporate this complexity becomes important when the goal of the investigation is to develop realistic interventions on the system, rather than providing more general insights.
Second, even a relatively simple model can benefit from model exploration, especially when uncertainties in the input parameters are more than nominal. In fact, the type of large-scale model exploration that we do with EMEWS can help a model developer (usually under a fixed time/effort budget) to better understand where effort could be most valuably expended for tightening parameter uncertainty bounds. As a note, we also apply EMEWS to non-agent-based methods because, as it turns out, other “black box” methods (e.g., microsimulation, machine/deep learning) can also greatly benefit from better characterization of their model behaviors.
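To make the point about prioritizing effort concrete, here is a minimal sketch of a one-at-a-time sensitivity sweep. It is not EMEWS code and not the workflow described above: `toy_model`, the parameter names, and the bounds in `BOUNDS` are all hypothetical stand-ins. Each parameter is varied across its uncertainty range while the others are held at their midpoints, and parameters are then ranked by how much the mean output moves.

```python
import random
import statistics

# Hypothetical uncertainty bounds for two input parameters (assumed values).
BOUNDS = {
    "contact_rate": (0.1, 0.9),
    "recovery_days": (5.0, 6.0),
}


def toy_model(params, seed):
    """Hypothetical stand-in for one stochastic simulation run."""
    rng = random.Random(seed)
    noise = rng.gauss(0.0, 0.05)
    return 10.0 * params["contact_rate"] + 0.5 * params["recovery_days"] + noise


def output_spread(name, n_levels=5, n_reps=20):
    """Vary one parameter across its bounds, holding the others at midpoint.

    Returns the range of the mean output over the tested levels. The same
    seeds are reused at every level so that the noise does not mask the
    effect of the parameter.
    """
    midpoint = {k: (lo + hi) / 2 for k, (lo, hi) in BOUNDS.items()}
    lo, hi = BOUNDS[name]
    means = []
    for i in range(n_levels):
        params = dict(midpoint)
        params[name] = lo + (hi - lo) * i / (n_levels - 1)
        runs = [toy_model(params, seed) for seed in range(n_reps)]
        means.append(statistics.mean(runs))
    return max(means) - min(means)


def rank_parameters():
    """Rank parameters by output spread, largest first."""
    spreads = {name: output_spread(name) for name in BOUNDS}
    return sorted(spreads.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for name, spread in rank_parameters():
        print(f"{name}: output spread {spread:.3f}")
```

In this toy setup the parameter at the top of the ranking is the one whose uncertainty bounds would be most worth tightening first. A one-at-a-time sweep ignores interactions between parameters, so it is only a first pass, not a substitute for a fuller global sensitivity analysis.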
I think you did a great job of articulating both the importance, and the benefits, of allowing modelers to more fully explore the parameter-space of their models; this was the key motivator behind my own work with BehaviorSearch (http://behaviorsearch.org/) as well. You mentioned the expertise-mismatch, wherein expert modelers may not have sufficient expertise with parameter exploration algorithms or high-performance computing in order to effectively answer the questions that they would like to answer.
I think we are both trying to work on these problems, but that we both have a long way to go. I like to think BehaviorSearch made it substantially easier for modelers to understand the process of using metaheuristic search algorithms on their parameter spaces and to run such searches, but there’s still a fair amount of jargon, and I suspect the user experience could still be made simpler and smoother. More importantly, BehaviorSearch doesn’t easily scale to HPC without writing custom scripts for deploying multiple searches in parallel and combining their data, and many parameter searches are not feasible without HPC.
EMEWS addresses the HPC problem directly, but based on my perusal of the EMEWS tutorial, I feel like it’s still going to be quite challenging for modelers who don’t have a strong CS or IT background to apply these tools to their own models. Do you have thoughts on future directions, for how to even further lower the threshold for use of these tools? Possibly involving cloud resources, which are more generally available to researchers world-wide?
Thanks Forrest! That’s a great question and one that we have recognized as critical for us to truly democratize the use of HPC resources via EMEWS. We have some ongoing efforts to lower the barriers to entry and we’re applying, for lack of a better phrase, a pattern-oriented approach, where through our work with our collaborators we are getting an understanding of what the important usage patterns are and where the particular pain points exist within those usage patterns.
One set of additions to the EMEWS ecosystem that we hope to be able to develop are higher-order templates that would encapsulate some of the more generic usage patterns and enable a user to provide a declarative specification indicating, e.g., model location, metaheuristic algorithm parameters, and have EMEWS handle the rest of the workflow composition. The user would still be able to take full advantage of the flexibility of EMEWS and go into the workflow components to make any additional adjustments if they are desired, but this would, in a sense, allow for the HPC components of the technology to fade into the background and only expose the more familiar aspects to the non-HPC researcher.
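As a rough illustration of what such a declarative specification might look like, here is a minimal sketch. It does not use or describe any actual EMEWS API: `ExplorationSpec`, `compose_workflow`, the step names, and the model path are all hypothetical, chosen only to show the shape of the idea (the user states the model location and algorithm settings, the template fills in the rest, and individual steps can still be overridden).

```python
from dataclasses import dataclass, field

# Hypothetical default workflow steps that a template might provide.
DEFAULT_STEPS = {
    "launch": "start worker pool",
    "propose": "ask algorithm for parameter sets",
    "evaluate": "run model on each parameter set",
    "collect": "return results to algorithm",
}


@dataclass
class ExplorationSpec:
    """Hypothetical declarative description of a model exploration."""
    model_path: str                                      # where the model lives
    algorithm: str                                       # metaheuristic to use
    algorithm_params: dict = field(default_factory=dict)
    overrides: dict = field(default_factory=dict)        # optional step edits


def compose_workflow(spec):
    """Expand a spec into named workflow steps (illustrative only)."""
    steps = dict(DEFAULT_STEPS)
    steps["propose"] = (
        f"ask {spec.algorithm} for parameter sets "
        f"using {spec.algorithm_params}"
    )
    steps["evaluate"] = f"run {spec.model_path} on each parameter set"
    unknown = set(spec.overrides) - set(steps)
    if unknown:
        raise ValueError(f"unknown workflow steps: {sorted(unknown)}")
    # Users who need more control can replace any individual step.
    steps.update(spec.overrides)
    return steps


if __name__ == "__main__":
    spec = ExplorationSpec(
        model_path="models/my_model.sh",  # hypothetical path
        algorithm="genetic_algorithm",
        algorithm_params={"population": 50},
    )
    for name, action in compose_workflow(spec).items():
        print(f"{name}: {action}")
```

The design choice being sketched is that the defaults cover the common usage pattern, while `overrides` keeps the underlying components reachable for anyone who wants to adjust them.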
Another area of work is indeed related to making cloud-based resources more easily usable with EMEWS. While it is currently possible to provision cloud-based MPI clusters and run EMEWS workflows on them, what we see is a future where the analysis, implemented through an EMEWS workflow, takes center stage and can be run on-demand by a number of different stakeholders with interest in the analysis outputs. This kind of capability would bring computational science closer to its promise of providing decision makers with the tools they need to make the best evidence-based decisions.