# Exploring parameter space {#sim-param-exploration}
In the previous chapter, we used BehaviorSpace to run structured simulation experiments. In this chapter, we step back and ask a broader methodological question: **how should we explore a model’s parameter space, and why?**
Agent-based models such as *Artificial Anasazi* typically contain multiple parameters that are:
* uncertain or weakly constrained by archaeological data,
* interacting in non-linear ways,
* capable of producing qualitatively different outcomes.
Parameter exploration is therefore not about “finding the right value”, but about **understanding the space of plausible behaviours** the model can generate.
## What is parameter space?
Each parameter in a model defines one dimension of a multidimensional space. A single simulation run corresponds to **one point** in that space.
In the Artificial Anasazi model, parameters such as:
* harvest-related parameters
* fertility or death-related parameters
* maize productivity or environmental constraints (data input)
together define a parameter space in which different combinations may lead to:
* long-term persistence,
* oscillating population sizes,
* rapid collapse and abandonment.
Even with only a handful of parameters, the number of possible combinations grows very quickly.
::: {.callout-note}
## Artificial Anasazi example
If we explore only three parameters, each with 10 possible values, this already produces
**10 × 10 × 10 = 1,000 configurations**, before accounting for repetitions.
:::
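The arithmetic in this example is easy to verify in R, where `expand.grid` enumerates a full factorial design (the parameter names and value ranges here are illustrative placeholders, not calibrated settings):

```r
# Three illustrative parameters, each with 10 candidate values
harvest_variance <- seq(0.1, 1.0, length.out = 10)
fertility_rate   <- seq(0.02, 0.06, length.out = 10)
death_age        <- 35:44

# Full factorial grid: every combination of the three parameters
grid <- expand.grid(
  harvest_variance = harvest_variance,
  fertility_rate   = fertility_rate,
  death_age        = death_age
)

nrow(grid)  # 10 * 10 * 10 = 1000 configurations
```

Adding a fourth parameter with 10 values would multiply this to 10,000 rows, which is why full factorial designs are rarely practical beyond a few dimensions.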
## Why not explore everything?
An exhaustive exploration of parameter space is almost never feasible:
* the number of combinations explodes combinatorially,
* many combinations are implausible,
* simulation time and data storage become limiting factors.
More importantly, **exhaustive exploration is rarely conceptually useful**. The goal is not to catalogue all possible outcomes, but to identify:
* sensitive parameters,
* thresholds and regime shifts,
* regions of stability and instability.
This motivates different **sampling strategies**.
## Structured parameter exploration (grids and sweeps)
BehaviorSpace is particularly well suited for **structured exploration**, where parameters are varied systematically across fixed values or ranges.
Typical strategies include:
* varying one parameter while keeping others fixed,
* exploring small grids of two or three parameters,
* increasing resolution around interesting regions.
In the Artificial Anasazi model, this might involve:
* sweeping `harvest-variance` from low to high values,
* comparing population trajectories across these levels,
* identifying values beyond which collapse becomes common.
::: {.callout-note}
## Artificial Anasazi example
A one-dimensional sweep of `harvest-variance` allows us to ask:
*How much environmental unpredictability can the system tolerate before settlement patterns break down?*
:::
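Such a sweep can be written down as a design table in R. In this sketch, the baseline values for the fixed parameters are illustrative assumptions, not calibrated settings:

```r
# Sweep harvest-variance while holding other parameters at a baseline
sweep <- data.frame(
  harvest_variance = seq(0.1, 0.9, by = 0.1),  # levels to test
  fertility_rate   = 0.04,                     # fixed baseline (illustrative)
  death_age        = 40,                       # fixed baseline (illustrative)
  repetitions      = 10                        # runs per level, for stochasticity
)

nrow(sweep)  # 9 parameter levels
```

Repetitions matter here: because the model is stochastic, each level of `harvest-variance` should be run several times so that differences between levels are not confused with run-to-run noise.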
Structured exploration is intuitive and easy to interpret, but it scales poorly as the number of parameters increases.
## Alternative designs or input data
Sometimes, existing alternative designs or "competing" versions of a model are also worth exploring. For this we can use nominal parameters, which in NetLogo typically reside in the interface as switch and chooser widgets. This option is not available in the Artificial Anasazi model, but the chooser `simulation-period` in the Mesara Trade model could be considered in this way.
## Random and quasi-random sampling
When the parameter space becomes too large for structured sweeps, an alternative approach is to **sample** it.
Instead of testing all combinations, we:
* define plausible ranges for each parameter,
* draw parameter values randomly (or quasi-randomly),
* run the model for each sampled configuration.
This approach treats the model as a **generator of outcomes**, which we probe statistically.
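For the quasi-random option, a Latin hypercube design spreads samples more evenly than plain random draws: each parameter's range is divided into equal strata, and each stratum is sampled exactly once. A minimal base-R sketch, with illustrative parameter ranges:

```r
set.seed(42)

n <- 100  # number of sampled configurations

# Latin hypercube draw: one point from each of n equal strata,
# independently shuffled per parameter, rescaled to [min, max]
lhs_draw <- function(n, min, max) {
  strata <- (sample(n) - runif(n)) / n  # shuffled stratum positions in (0, 1)
  min + strata * (max - min)
}

lhs <- data.frame(
  harvest_variance = lhs_draw(n, 0.2, 0.8),
  fertility_rate   = lhs_draw(n, 0.02, 0.06)
)
```

Because every stratum of each range is covered, even a modest sample avoids the clusters and gaps that purely random draws can leave in the parameter space.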
Random sampling is particularly useful when:
* interactions between parameters are expected,
* we are interested in global patterns rather than fine-grained tuning,
* we plan to analyse results in R.
::: {.callout-note}
## Artificial Anasazi example
Rather than varying only `harvest-variance`, we might simultaneously sample:
* `harvest-adjustment` (the fraction of the harvest that is actually usable)
* fertility-related parameters
* mortality thresholds
to ask which combinations tend to produce persistence versus collapse.
:::
## Defining plausible parameter ranges
Before sampling, parameter ranges must be defined carefully.
For archaeological models, this step is *theoretical*, not technical. Ranges should reflect:
* archaeological estimates,
* ethnographic analogies,
* exploratory uncertainty rather than extremes.
For example, in the Artificial Anasazi model:
* `harvest-variance` might reasonably vary between low and moderate unpredictability,
* demographic parameters should avoid biologically implausible values,
* environmental productivity should reflect the known limits of maize agriculture.
::: {.callout-warning}
## Important
Wide parameter ranges increase coverage, but also increase the risk of generating meaningless or misleading outcomes.
:::
## Systematic random sampling in R
Once parameter ranges are defined, R can be used to generate a **design of experiments** that is then passed to NetLogo.
A simple random sampling workflow in R involves:
1. defining parameter ranges,
2. drawing random samples,
3. exporting them as a table,
4. running NetLogo for each row.
### Conceptual example (R)
```r
set.seed(123)

n <- 200  # number of sampled configurations

params <- data.frame(
  harvest_variance = runif(n, min = 0.2, max = 0.8),
  fertility_rate   = runif(n, min = 0.02, max = 0.06),
  death_age        = sample(35:45, n, replace = TRUE)
)
```
Each row of this table represents **one plausible Anasazi world**.
These configurations can then be:
* read from file during BehaviorSpace setups,
* read from a file in a specific procedure in NetLogo, or
* looped over using RNetLogo or nlrx to execute simulations from R.
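For the file-based options, a sketch of the export step, recreating the sampled design from the conceptual example above (the file name is an arbitrary choice):

```r
# Recreate the sampled design from the conceptual example
set.seed(123)
n <- 200
params <- data.frame(
  harvest_variance = runif(n, min = 0.2, max = 0.8),
  fertility_rate   = runif(n, min = 0.02, max = 0.06),
  death_age        = sample(35:45, n, replace = TRUE)
)

# Add a run identifier so outputs can be matched back to inputs
params$run_id <- seq_len(n)

# Export without row names or quotes, keeping the file simple to parse
# with NetLogo's file-reading primitives
write.csv(params, "anasazi-design.csv", row.names = FALSE, quote = FALSE)
```

The explicit `run_id` column is worth the extra few bytes: once hundreds of runs are finished, it is the only reliable way to join outputs back to the sampled inputs.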
### Interpreting sampled results
Unlike structured sweeps, random sampling does not produce neatly ordered results. Interpretation therefore relies on:
* visualisation (scatter plots, density plots),
* summary statistics,
* comparisons between outcome categories (e.g. persistence vs. collapse).
For the Artificial Anasazi model, typical analyses might include:
* relating the final population size to sampled parameters,
* identifying parameter regions associated with early abandonment,
* comparing distributions of outcomes rather than single trajectories.
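A minimal sketch of this kind of analysis, using a synthetic stand-in for the NetLogo output (the `final_population` values below are fabricated purely to illustrate the workflow, not real simulation results):

```r
set.seed(1)

n <- 200

# Synthetic stand-in results: in a real workflow, final_population would
# come from NetLogo output, joined to the sampled design by run id
results <- data.frame(
  harvest_variance = runif(n, 0.2, 0.8),
  final_population = rpois(n, 50)
)
# Make collapse more likely at high variance (illustrative only)
results$final_population <- round(results$final_population *
                                    (1 - results$harvest_variance))

# Categorise outcomes instead of inspecting single trajectories
results$outcome <- ifelse(results$final_population > 20,
                          "persistence", "collapse")

# Compare parameter distributions between outcome categories
aggregate(harvest_variance ~ outcome, data = results, FUN = mean)
```

If the sampling has done its job, the mean `harvest_variance` should differ clearly between the two categories, pointing towards the parameter region associated with collapse.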
::: {.callout-note}
## Artificial Anasazi example
Instead of asking *“What happens when harvest variability is 0.6?”*,
we ask *“Under which combinations of parameters does collapse become likely?”*
:::
### Complementarity with BehaviorSpace
BehaviorSpace and R-based sampling are not competing approaches.
* **BehaviorSpace** excels at structured, transparent experiments.
* **R-based sampling** excels at large-scale, flexible exploration.
In practice, a common workflow is:
1. structured exploration to build intuition,
2. random sampling to explore interactions,
3. refined sweeps around interesting regions.
This iterative process reflects how agent-based models are used as **theoretical laboratories**, not predictive machines.
## Exercises
::: {.callout-tip}
## Exercise 1: Conceptual
List three parameters of the Artificial Anasazi model.
* Which are well-constrained archaeologically?
* Which are highly uncertain?
* Which would you prioritise for exploration, and why?
:::
::: {.callout-tip}
## Exercise 2: Design
Define plausible ranges for two parameters of the model.
* Justify your choices in archaeological terms.
* Discuss what kinds of outcomes you expect at the extremes.
:::
::: {.callout-tip}
## Exercise 3: Sampling logic
Assume you can only afford **200 simulation runs**.
* Would you prefer a structured grid or random sampling?
* What questions would each approach allow (or prevent) you from answering?
:::