Rupture Propagation
We often get a bunch of faults that we know participated in a rupture. Here is an example for the Darfield quake.
(Here each colour represents a different fault.)
We might know the hypocentre, or this might be a rupture from the NSHM where we don't know the hypocentre. The problem is that the faults don't all rupture simultaneously; they trigger each other in a specific order. In fact, there are lots of possible orders.
Here's one:

And another:

And one more for luck:

The red arrows make up what's called the rupture propagation tree.
If you're thinking to yourself, "that last one seems a bit silly," you're right! It's totally unphysical. It turns out that the probability of jumping between two given faults is related to their distance from each other. Using this probability, you can derive the probability of a whole rupture tree. Essentially, the rupture propagation problem is two problems:
- If we know a hypocentre, we want to fairly (according to the probability of each rupture tree) sample possible rupture propagation trees starting from the fault containing the hypocentre.
- If we don't know the hypocentre (as in the case of NSHM, or perhaps if you wanted to rerun Darfield with the same faults but different rupture scenarios), we want to sample the rupture trees fairly over every possible initial fault.
The rupture propagation model simulates the probability of different rupture trees by modelling ruptures using graphs. Specifically, every fault/source becomes a node, and the edge weights are probabilities that the rupture could jump between nodes. If we assume that rupture jumps are independent, and allow for a moment that the same fault could be triggered twice, we can draw random numbers for each edge to obtain different subgraphs.

In the above example, we constructed a graph out of five sources, assigned probabilities of rupture jumping between them according to some measure (in our code, we use distance), and then drew a random subgraph $H$.
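The independent-draw step can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the edge probabilities below are made-up values for five hypothetical sources, and `draw_random_subgraph` is a name introduced here.

```python
import random


def draw_random_subgraph(edge_probabilities, rng):
    """Keep each edge independently with its own jump probability."""
    return {edge for edge, p in edge_probabilities.items() if rng.random() < p}


# Hypothetical jump probabilities between five sources labelled 0..4.
edge_probabilities = {
    (0, 1): 0.9,
    (1, 2): 0.8,
    (0, 2): 0.5,
    (2, 3): 0.3,
    (3, 4): 0.05,
}

rng = random.Random(0)  # seeded so the draw is repeatable
H = draw_random_subgraph(edge_probabilities, rng)
print(sorted(H))
```

Nothing in this draw forces the result to be connected or acyclic, so it can easily produce a subgraph that is not a valid rupture scenario.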
Looking again at our example subgraph $H$, there are two problems:
- There is a cycle in $H$, which would mean that some source would have to be triggered twice. Because a source cannot be triggered twice, we require $H$ to be acyclic to be a valid rupture propagation scenario.
- One of the sources is never triggered. This is not physically impossible (indeed, it's highly likely), but we want to assume that every source is triggered. Therefore, we need $H$ to be a spanning tree.
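Both requirements can be checked together with a union-find pass. The sketch below is illustrative only; `is_spanning_tree` and the example edge lists are hypothetical and not taken from the project code.

```python
def is_spanning_tree(nodes, edges):
    """Return True if `edges` form a spanning tree on `nodes`."""
    nodes = list(nodes)
    edges = list(edges)
    # A spanning tree on n nodes has exactly n - 1 edges.
    if len(edges) != len(nodes) - 1:
        return False
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent pointers to the representative of the component.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # u and v are already connected, so (u, v) closes a cycle
        parent[root_u] = root_v
    # n - 1 edges with no cycle necessarily connect every node.
    return True


sources = range(5)
cyclic = [(0, 1), (1, 2), (0, 2), (2, 3)]  # has a cycle and never reaches source 4
path = [(0, 1), (1, 2), (2, 3), (3, 4)]    # every source triggered exactly once
print(is_spanning_tree(sources, cyclic), is_spanning_tree(sources, path))
```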
To fairly sample rupture propagation trees, we calculate the conditional probability $P(H \mid H \text{ is a spanning tree of } G)$.
For realistic scenarios, including the Darfield earthquake, there are thousands or tens of thousands of spanning trees, but most have a negligible probability. We want to sample these spanning trees according to their conditional probability without generating every single one.
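For a graph small enough to enumerate, the conditional probability can be computed by brute force, which is useful as a reference when checking a smarter sampler. The sketch below assumes independent edge draws, so a subgraph's probability is the product of `p` over kept edges and `1 - p` over dropped edges. The function names and the three-source probabilities are hypothetical.

```python
from itertools import combinations
from math import prod


def spanning_trees(nodes, edges):
    """Yield every spanning tree as a frozenset of edges (brute force)."""
    nodes = list(nodes)
    for candidate in combinations(edges, len(nodes) - 1):
        # n - 1 edges form a spanning tree exactly when they connect all nodes.
        reached = {nodes[0]}
        frontier = [nodes[0]]
        while frontier:
            current = frontier.pop()
            for u, v in candidate:
                if current in (u, v):
                    other = v if current == u else u
                    if other not in reached:
                        reached.add(other)
                        frontier.append(other)
        if len(reached) == len(nodes):
            yield frozenset(candidate)


def subgraph_probability(kept, edge_probabilities):
    """Probability of drawing exactly the edges in `kept` under independent draws."""
    return prod(p if edge in kept else 1.0 - p for edge, p in edge_probabilities.items())


def conditional_tree_probabilities(nodes, edge_probabilities):
    """Map each spanning tree to P(tree | the drawn subgraph is a spanning tree)."""
    raw = {
        tree: subgraph_probability(tree, edge_probabilities)
        for tree in spanning_trees(nodes, edge_probabilities)
    }
    total = sum(raw.values())
    return {tree: value / total for tree, value in raw.items()}


nodes = ["A", "B", "C"]
edge_probabilities = {("A", "B"): 0.9, ("B", "C"): 0.5, ("A", "C"): 0.1}
conditional = conditional_tree_probabilities(nodes, edge_probabilities)
for tree, probability in sorted(conditional.items(), key=lambda item: -item[1]):
    print(sorted(tree), round(probability, 3))
```

Enumeration like this grows far too quickly for real ruptures, which is why the sampling approach matters.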
The initial fault could be known, unknown, or determined according to a prior distribution over the nodes. To account for this, we independently select one of the sources to be the initial fault after selecting a spanning tree. Selecting an initial fault and tree completely determines the rupture propagation tree, which is defined as a spanning tree with directed edges indicating the direction of rupture propagation. For example, in the graph
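Turning an undirected spanning tree plus an initial fault into a rupture propagation tree amounts to directing every edge away from the initial fault. A breadth-first traversal does this. The sketch below uses hypothetical fault labels and a helper name, `orient_tree`, introduced only for illustration.

```python
from collections import deque


def orient_tree(tree_edges, initial_fault):
    """Direct each undirected tree edge away from the initial fault."""
    neighbours = {}
    for u, v in tree_edges:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    directed = []
    visited = {initial_fault}
    queue = deque([initial_fault])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, []):
            if nxt not in visited:
                visited.add(nxt)
                directed.append((current, nxt))  # rupture jumps current -> nxt
                queue.append(nxt)
    return directed


tree = [("A", "B"), ("B", "C"), ("B", "D")]
from_a = orient_tree(tree, "A")
from_c = orient_tree(tree, "C")
print(from_a)
print(from_c)
```

The same spanning tree yields a different rupture propagation tree for each choice of initial fault, which is why the two selections can be made independently.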

To get a feel for this problem, let’s create a simple example with three point sources. Assume we have three point sources

The conditional probabilities for
We can only efficiently sample spanning trees proportional to their weight, which is defined as the product of edge weights in the tree. The probability distribution we obtain is:
At first glance,
[^1]: More formally,
Doing so ensures that:
The term
Sampling spanning trees in proportion to the product of their edge weights is implemented in NetworkX as `networkx.random_spanning_tree` (docs).
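The gap between "product of edge weights" and "conditional probability of the tree" can be checked numerically on a triangle. The sketch below assumes independent edge draws and uses the odds `p / (1 - p)` as edge weights. That weight choice is an assumption made for this illustration, and the probabilities are made-up values.

```python
from itertools import combinations
from math import prod

edge_probabilities = {("A", "B"): 0.9, ("B", "C"): 0.5, ("A", "C"): 0.1}
# In a triangle, every pair of edges is a spanning tree.
trees = [frozenset(pair) for pair in combinations(edge_probabilities, 2)]


def normalise(values):
    total = sum(values)
    return [value / total for value in values]


# Target distribution: probability of drawing exactly the tree's edges,
# conditioned on the drawn subgraph being a spanning tree.
exact = normalise([
    prod(p if edge in tree else 1.0 - p for edge, p in edge_probabilities.items())
    for tree in trees
])

# Product-of-weights distribution using raw probabilities as weights.
by_raw = normalise([prod(edge_probabilities[edge] for edge in tree) for tree in trees])

# Product-of-weights distribution using odds as weights (assumed weight choice).
odds = {edge: p / (1.0 - p) for edge, p in edge_probabilities.items()}
by_odds = normalise([prod(odds[edge] for edge in tree) for tree in trees])

for tree, a, b, c in zip(trees, exact, by_raw, by_odds):
    print(sorted(tree), round(a, 4), round(b, 4), round(c, 4))
```

The odds weights reproduce the target distribution because the factor contributed by all the `1 - p` terms is the same for every tree and cancels on normalisation. Raw probabilities as weights give a different distribution.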
Suppose that, instead of knowing how many ruptures you want, you want a list of ruptures that represent at least a fixed probability of the total distribution. More formally, given a probability threshold
To achieve this, we collect a list of spanning trees ordered from highest to lowest probability and stop when their cumulative probability meets or exceeds the threshold. Conceptually, the algorithm looks like this:
```python
total_probability_of_spanning_trees = sum(probability(tree) for tree in spanning_trees)
# NOTE: we don’t calculate the total probability like this!
# We use a smarter method that avoids computing all trees.
sample = []
sample_probability = 0.0
for tree in spanning_trees_ordered_by_probability(G):
    sample.append(tree)
    # P(T | T is a spanning tree of G)
    sample_probability += probability(tree) / total_probability_of_spanning_trees
    if sample_probability >= threshold:
        break
return sample
```

In the toy example, if
NetworkX includes the `SpanningTreeIterator` class (docs), which allows us to:

> Iterate over all spanning trees of a graph in either increasing or decreasing cost.
However, NetworkX defines cost as the sum of edge weights. To align the minimum spanning tree definition of cost with the product definition of tree probability, we can use logarithms. Specifically, we create a graph
This transformation ensures that iterating over the graphs in descending order of summed cost in
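To check when the cumulative probability of the selected spanning trees meets the threshold, we need the total weight of all spanning trees. This can be computed without enumerating the trees using `nx.number_of_spanning_trees` (docs).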
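The two ingredients, ordering by summed logarithms and a total computed without enumeration, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code. The brute-force `spanning_trees` generator stands in for an ordered iterator, `weighted_tree_total` applies the matrix-tree theorem (a cofactor of the weighted Laplacian equals the sum over spanning trees of the product of edge weights), and the weights are made-up positive values.

```python
from itertools import combinations
from math import exp, log


def weighted_tree_total(nodes, weights):
    """Sum over all spanning trees of the product of edge weights."""
    nodes = list(nodes)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    laplacian = [[0.0] * n for _ in range(n)]
    for (u, v), w in weights.items():
        i, j = index[u], index[v]
        laplacian[i][i] += w
        laplacian[j][j] += w
        laplacian[i][j] -= w
        laplacian[j][i] -= w
    # Delete the last row and column, then take the determinant by elimination.
    minor = [row[:-1] for row in laplacian[:-1]]
    size = n - 1
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(minor[r][col]))
        if minor[pivot][col] == 0.0:
            return 0.0  # singular minor: the graph is disconnected
        if pivot != col:
            minor[col], minor[pivot] = minor[pivot], minor[col]
            det = -det
        det *= minor[col][col]
        for r in range(col + 1, size):
            factor = minor[r][col] / minor[col][col]
            for c in range(col, size):
                minor[r][c] -= factor * minor[col][c]
    return det


def spanning_trees(nodes, weights):
    """Yield every spanning tree as a tuple of edges (brute force)."""
    nodes = list(nodes)
    for candidate in combinations(weights, len(nodes) - 1):
        parent = {node: node for node in nodes}

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for u, v in candidate:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:
            yield candidate


def trees_until_threshold(nodes, weights, threshold):
    """Collect the most probable trees until their cumulative probability reaches `threshold`."""
    total = weighted_tree_total(nodes, weights)

    def log_cost(tree):
        return sum(log(weights[edge]) for edge in tree)

    # Descending summed log weight is the same order as descending product of weights.
    ordered = sorted(spanning_trees(nodes, weights), key=log_cost, reverse=True)
    selected, cumulative = [], 0.0
    for tree in ordered:
        probability = exp(log_cost(tree)) / total
        selected.append((tree, probability))
        cumulative += probability
        if cumulative >= threshold:
            break
    return selected


nodes = ["A", "B", "C"]
weights = {("A", "B"): 4.0, ("B", "C"): 2.0, ("A", "C"): 1.0}
selected = trees_until_threshold(nodes, weights, threshold=0.8)
for tree, probability in selected:
    print(tree, round(probability, 3))
```

With these weights the total is 4·2 + 4·1 + 2·1 = 14, so the two heaviest trees (8/14 and 4/14) are enough to pass a threshold of 0.8.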
Atzori (2010) defines the rupture geometry for the Darfield earthquake as consisting of 8 segments. Below is a map of the segments:

The hypocentre for this event was located on the fault circled in red. For these 8 segments, there are 8575 spanning trees. The real rupture path taken in 2010 corresponds to one of these spanning trees, but we do not know which one. Instead, we must fairly sample the spanning trees based on their probabilities using the new rupture propagation sampling algorithm.
By cumulatively summing the spanning tree probabilities, we can construct the cumulative distribution function (CDF) for

This type of distribution is typical: only the first 10 or so rupture scenarios have non-trivial probabilities, while the remaining scenarios involve such improbable rupture jumps that they are not worth simulating. Among these, three rupture scenarios dominate, accounting for the majority of the probability mass.
- Top Scenario. Probability: 28%. Visualisation:

- Second Most Likely Scenario. Probability: 27%. Visualisation:

- Third Most Likely Scenario. Probability: 25%. Visualisation:

- Fourth Most Likely Scenario. Probability: 2%. Visualisation:

The top three rupture scenarios account for approximately 80% of the total probability mass, making them the most plausible candidates for simulation. The remaining spanning trees, although numerous, contribute negligibly to the total probability distribution and can probably be disregarded in practical applications.
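The cumulative sum behind these figures is easy to reproduce from the four probabilities listed above. The helper `scenarios_needed` is a hypothetical name used only for this sketch.

```python
from itertools import accumulate

# Probabilities of the four most likely scenarios quoted above.
scenario_probabilities = [0.28, 0.27, 0.25, 0.02]
cdf = list(accumulate(scenario_probabilities))


def scenarios_needed(cdf, threshold):
    """Number of top scenarios required to reach `threshold`, or None if it is never reached."""
    for count, cumulative in enumerate(cdf, start=1):
        if cumulative >= threshold:
            return count
    return None


print([round(value, 2) for value in cdf])
print(scenarios_needed(cdf, 0.75))
```

The fourth scenario adds only two percentage points, which shows how quickly the remaining probability mass thins out.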