A comprehensive Python toolkit for analysing temporal (dynamic) networks using Stochastic Block Models (SBM) and hypergraph group extraction.
This project was developed as an examination work for the course Network Data Analysis taught by Prof. Maria Francesca Marino at the University of Florence, within the Master's programme in Data Science and Statistical Learning (MD2SL).
The toolkit implements rigorous statistical methods for network community detection, combining classical network analysis with modern inference-based approaches. It also includes an optional hypergraph analysis module that extracts group interactions via maximal clique enumeration, following the methodology of Iacopini et al. (2022). The complete pipeline transforms raw temporal edge data into publication-quality visualisations and comprehensive statistical reports.
- Features
- Installation
- Quick Start
- Input Data Format
- Output
- Example: LyonSchool Dataset
- Configuration
- Theoretical Background ← Separate document
- References
- Acknowledgments
- License
This toolkit provides a comprehensive suite of network analysis methods, structured around three principal pillars: static network analysis, stochastic block model inference, and temporal dynamics. Each component has been designed to offer rigorous statistical foundations whilst maintaining computational efficiency and ease of use.
The foundation of any network analysis begins with a thorough understanding of basic structural properties. The toolkit computes global statistics including the number of nodes, edges, network density, and the identification of connected components. Distance metrics such as network diameter and average path length provide insight into the overall navigability of the graph. Degree analysis encompasses the full distribution of node degrees alongside summary statistics including mean, standard deviation, minimum, maximum, and median values.
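The global and degree statistics listed above can be illustrated with a minimal, standard-library sketch. It is not the toolkit's code (which relies on graph-tool); the function name `basic_stats` is hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, median, pstdev


def basic_stats(edges):
    """Return global and degree statistics for an undirected simple graph."""
    # Collapse repeated contacts and ignore orientation and self-loops.
    unique = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    nodes = {v for e in unique for v in e}
    degree = {v: 0 for v in nodes}
    for u, v in unique:
        degree[u] += 1
        degree[v] += 1
    n, m = len(nodes), len(unique)
    degs = sorted(degree.values())
    return {
        "nodes": n,
        "edges": m,
        # Fraction of the n(n-1)/2 possible edges that are present.
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "mean_degree": mean(degs),
        "std_degree": pstdev(degs),  # assumption: population standard deviation
        "min_degree": degs[0],
        "max_degree": degs[-1],
        "median_degree": median(degs),
    }


# Toy edge list; the repeated contact (3, 4)/(4, 3) counts once.
stats = basic_stats([(1, 2), (2, 3), (1, 3), (3, 4), (4, 3)])
print(stats)
```

Distance metrics (diameter, average path length) are left out of the sketch because they require a shortest-path search.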
Clustering behaviour is assessed through both the local clustering coefficient and the global transitivity measure, offering complementary perspectives on triadic closure within the network. The toolkit further implements a comprehensive suite of centrality measures: degree centrality captures immediate connectivity, betweenness centrality quantifies the extent to which nodes lie on shortest paths between others, closeness centrality measures the average distance from each node to all others, and eigenvector centrality identifies nodes connected to other well-connected nodes. Finally, network centralisation is computed using Freeman's index, which quantifies the extent to which the network structure is dominated by a single node or small group of nodes.
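As an illustration of Freeman's index, the degree-based variant can be sketched as follows. The sketch assumes raw (unnormalised) degrees on an undirected simple graph, for which the maximum attainable sum of differences is (n-1)(n-2), reached by a star; the function name is hypothetical.

```python
def degree_centralisation(degree):
    """Freeman degree centralisation from a {node: degree} mapping."""
    n = len(degree)
    if n < 3:
        raise ValueError("centralisation needs at least three nodes")
    d_max = max(degree.values())
    # Sum of gaps to the most central node, scaled by the star-graph maximum.
    return sum(d_max - d for d in degree.values()) / ((n - 1) * (n - 2))


star = {"hub": 4, "a": 1, "b": 1, "c": 1, "d": 1}  # one dominant node
ring = {v: 2 for v in "abcde"}                     # perfectly even degrees

print(degree_centralisation(star))  # 1.0
print(degree_centralisation(ring))  # 0.0
```

Plugging in the LyonSchool figures reported in the example section (n = 242, maximum degree 134, degree sum 2 × 8,317 = 16,634) gives 15,794 / 57,840 ≈ 0.273, in line with the degree centralisation listed there.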
Moving beyond heuristic community detection methods, this toolkit employs principled Bayesian inference for network partitioning. The optimal number of blocks is determined automatically through the Minimum Description Length (MDL) criterion, which balances model complexity against goodness of fit. Nodes are assigned to blocks via a hard partition derived from the maximum a posteriori (MAP) estimate obtained through Markov Chain Monte Carlo (MCMC) inference.
The inter-block connection probability matrix Π characterises the propensity for edges to form between and within blocks, whilst internal block density analysis reveals the cohesiveness of each community. For comparison with alternative model selection criteria, the Integrated Classification Likelihood (ICL) is also computed.
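Given a hard partition, one natural estimate of Π divides the number of observed edges between two blocks by the number of possible node pairs. The sketch below assumes an undirected simple graph with unique edges; `block_matrix` is a hypothetical name, and the toolkit's own estimator may differ in detail.

```python
from collections import Counter
from itertools import combinations_with_replacement


def block_matrix(edges, block_of):
    """Estimate pi[(q, l)] = observed edges / possible pairs between blocks q and l."""
    sizes = Counter(block_of.values())
    counts = Counter()
    for u, v in edges:
        q, l = sorted((block_of[u], block_of[v]))
        counts[(q, l)] += 1
    pi = {}
    for q, l in combinations_with_replacement(sorted(sizes), 2):
        # Within a block there are n(n-1)/2 pairs; between blocks, n_q * n_l.
        pairs = sizes[q] * (sizes[q] - 1) // 2 if q == l else sizes[q] * sizes[l]
        value = counts[(q, l)] / pairs if pairs else 0.0
        pi[(q, l)] = pi[(l, q)] = value
    return pi


# Toy example: a triangle (block 0), a dyad (block 1), and one bridging edge.
block_of = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (3, 4)]
pi = block_matrix(edges, block_of)
print(pi[(0, 0)], pi[(1, 1)], pi[(0, 1)])
```

The diagonal entries of this matrix are the internal block densities; a value of 1.0 means the block is a clique.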
For temporal networks, the toolkit implements a sliding-window approach inspired by the dynsbm methodology. Temporal data are partitioned into overlapping time windows, with an independent SBM fitted to each snapshot. A critical challenge in this setting is the correspondence problem: block labels are arbitrary within each window, making direct comparison across time points problematic. This is addressed through label alignment using the Hungarian algorithm, which finds the optimal permutation of labels to maximise consistency between consecutive windows.
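The alignment step can be sketched as a search for the relabelling of the current window that agrees with the previous window on as many shared nodes as possible. For brevity the sketch below uses a brute-force search over permutations, which stands in for the Hungarian algorithm and is only practical for a handful of blocks; `scipy.optimize.linear_sum_assignment` applied to the block-overlap matrix solves the same assignment problem in polynomial time. The function name and the assumption that previous labels are integers 0..K-1 are illustrative.

```python
from itertools import permutations


def align_labels(prev, curr):
    """Relabel `curr` to maximise agreement with `prev` on shared nodes."""
    shared = prev.keys() & curr.keys()
    curr_labels = sorted(set(curr.values()))
    # Allow as many target labels as the larger of the two partitions needs.
    k = max(len(set(prev.values())), len(curr_labels))
    best_map, best_score = None, -1
    for perm in permutations(range(k), len(curr_labels)):
        mapping = dict(zip(curr_labels, perm))
        score = sum(mapping[curr[v]] == prev[v] for v in shared)
        if score > best_score:
            best_map, best_score = mapping, score
    return {v: best_map[b] for v, b in curr.items()}


prev = {"a": 0, "b": 0, "c": 1, "d": 1}
curr = {"a": 1, "b": 1, "c": 0, "d": 0, "e": 0}  # same groups, labels swapped
aligned = align_labels(prev, curr)
print(aligned)
```

Node "e" is absent from the previous window, so it simply inherits the relabelled block of its current companions.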
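The aligned block assignments enable the computation of a transition probability matrix P̂, whose entry p̂_rs = P(b_{t+1}=s | b_t=r) estimates the probability that a node in block r in one window is found in block s in the next.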
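A transition matrix of this kind can be estimated by counting block-to-block moves between consecutive windows and normalising each row. This is a minimal sketch under the assumption that only nodes present in both windows contribute; `transition_matrix` is a hypothetical name.

```python
from collections import Counter


def transition_matrix(snapshots):
    """Estimate p[(r, s)] = P(b_{t+1} = s | b_t = r) from aligned snapshots."""
    counts = Counter()
    row_totals = Counter()
    for prev, curr in zip(snapshots, snapshots[1:]):
        # Assumption: nodes missing from either window are ignored.
        for node in prev.keys() & curr.keys():
            counts[(prev[node], curr[node])] += 1
            row_totals[prev[node]] += 1
    return {(r, s): c / row_totals[r] for (r, s), c in counts.items()}


snapshots = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 0, "b": 1, "c": 0},
]
p_hat = transition_matrix(snapshots)
print(p_hat)
```

Each row sums to one, and a diagonal entry p̂_rr can be read as the share of block r's members that stay put from one window to the next.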
Many real-world social interactions involve more than two individuals simultaneously. Whilst temporal edge lists record only pairwise contacts, it is possible to approximate higher-order group interactions by extracting cliques from aggregated time windows, following the methodology of Iacopini et al. (2022).
The underlying rationale is straightforward: within a given time window, if individuals A, B, and C all interact pairwise (A↔B, B↔C, A↔C), they have likely participated in a group interaction. This pattern corresponds to a clique—that is, a complete subgraph—in the contact network. The toolkit extracts maximal cliques from each time-window snapshot, where each clique of size k represents a k-person group interaction, or equivalently, a hyperedge of order k. From these extractions, the distribution of group sizes across all windows is computed, and the temporal evolution of group sizes is tracked throughout the observation period.
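The A, B, C example can be made concrete with a small maximal-clique enumeration. The toolkit itself relies on NetworkX for this step; the sketch below instead uses a self-contained Bron–Kerbosch search with pivoting so that it runs with the standard library alone. The function names are hypothetical, and treating the size limits as a simple filter on clique size is an assumption.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of a graph given as {node: set(neighbours)}."""
    def expand(r, p, x):
        if not p and not x:
            yield r
            return
        # Pivot on the vertex with most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def extract_groups(edges, min_size=3, max_size=20):
    """Return the maximal cliques of one window whose size lies in the given range."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return sorted(
        sorted(c) for c in maximal_cliques(adj) if min_size <= len(c) <= max_size
    )


# One aggregated window: A, B, C all meet pairwise; D only meets C.
window_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(extract_groups(window_edges))              # [['A', 'B', 'C']]
print(extract_groups(window_edges, min_size=2))  # [['A', 'B', 'C'], ['C', 'D']]
```

The triangle is reported once as a three-person group rather than as three separate pairs, which is exactly the hyperedge interpretation described above.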
This analysis is optional and may be enabled with the --hypergraph flag.
All analyses are accompanied by publication-ready visualisations designed to communicate results effectively. For degree analysis, both linear and log-log scale distributions are produced. Centrality measures are presented through comparison scatter plots alongside network graphs where node sizes reflect centrality values. The SBM results are visualised through community structure diagrams and block connection probability heatmaps.
Temporal aspects of the data are captured through activity timelines, whilst the dynamic SBM results are presented via transition heatmaps and block evolution plots. When hypergraph analysis is enabled, additional figures display the group size distribution and the temporal trajectory of group sizes (median and interquartile range per window). An optional animated network evolution visualisation is also available, with configurable resolution settings.
Further details regarding output filenames may be found in the Output section below.
- Python 3.8+
- Linux or WSL (Windows Subsystem for Linux) — required for graph-tool
- 4GB RAM recommended for large networks
The core of this toolkit relies on graph-tool, a highly efficient C++ library for network analysis. It cannot be installed via pip.
Conda (recommended — works reliably on Linux, macOS, and WSL)
conda create -n netsbm python=3.10
conda activate netsbm
conda install -c conda-forge graph-tool
Alternative: Ubuntu/Debian via apt (may require additional configuration)
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository universe
sudo apt update
sudo apt install python3-graph-tool
⚠️ The apt method may fail on some Ubuntu versions. If you encounter issues, use Conda instead.
Docker (containerized)
docker pull tiagopeixoto/graph-tool
docker run -it tiagopeixoto/graph-tool python3
With your conda environment activated:
conda activate netsbm
pip install -r requirements.txt
requirements.txt includes NumPy, SciPy, Matplotlib, pandas, PyYAML, and NetworkX (used for clique enumeration in hypergraph analysis).
⚠️ Important: Always install packages inside the activated conda environment. Do not use sudo with pip or conda.
git clone https://github.com/battles5/temporal-network-sbm.git
cd temporal-network-sbm
On Windows, this toolkit requires WSL (Windows Subsystem for Linux):
- Open PowerShell as Administrator and run:
  wsl --install
- Install Ubuntu from Microsoft Store (22.04 or 24.04)
- Inside WSL, install Miniconda and then graph-tool via conda-forge:
  wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
  bash Miniconda3-latest-Linux-x86_64.sh
  conda create -n netsbm python=3.10
  conda activate netsbm
  conda install -c conda-forge graph-tool
- Run the toolkit from within WSL
Note: This repository does not include datasets. Download a temporal network dataset before running the analysis.
Recommended: LyonSchool (primary school face-to-face contacts)
Other compatible datasets: SFHH, InVS, Thiers, Hospital (all from SocioPatterns).
Setup: Create the data/ folder and place your downloaded dataset there:
mkdir -p data
# Place the downloaded dataset here, e.g.:
# data/tij_LyonSchool.dat
The toolkit expects a simple edge list format:
timestamp node1 node2
0 1 2
0 2 3
20 1 3
...
python main.py --input <data_file> --output <output_dir>
# Run analysis on LyonSchool (download from SocioPatterns first)
python main.py --input data/tij_LyonSchool.dat --output output/
# With custom configuration
python main.py --input data/mydata.dat --output output/ --config my_config.yaml
# Generate network animation (resource-intensive)
python main.py --input data/mydata.dat --output output/ --animate
# Skip dynamic SBM for faster execution
python main.py --input data/mydata.dat --output output/ --no-dynamic-sbm
# Enable hypergraph group extraction (clique analysis)
python main.py --input data/mydata.dat --output output/ --hypergraph
# Hypergraph with custom group size limits
python main.py --input data/mydata.dat --output output/ --hypergraph --min-group-size 4 --max-group-size 15

| Argument | Short | Description |
|---|---|---|
| --input | -i | Path to input data file (required) |
| --output | -o | Path to output directory (required) |
| --config | -c | Path to YAML configuration file |
| --animate | | Generate network animation (MP4/GIF) |
| --no-dynamic-sbm | | Skip dynamic SBM analysis |
| --hypergraph | | Enable hypergraph group extraction via cliques |
| --min-group-size | | Minimum clique size (default: 3) |
| --max-group-size | | Maximum clique size (default: 20) |
| --max-cliques-per-window | | Safety limit for clique enumeration |
The toolkit is designed to be generic — it works with any temporal edge list, not hardcoded to specific datasets.
A text file with three columns:
timestamp node1 node2
31220 1 42
31220 5 12
31240 1 42
...
| Column | Type | Description |
|---|---|---|
| timestamp | Integer | Timestamp (seconds, UNIX time, or any integer) |
| node1 | Integer | ID of the first node |
| node2 | Integer | ID of the second node |
The parser auto-detects common separators:
- Space (default)
- Tab (\t)
- Comma (,)
- Semicolon (;)
You can also specify the separator explicitly in config.yaml.
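To make the input format concrete, here is a minimal sketch of how separator auto-detection and parsing might work. It is not the toolkit's parser: the function names are hypothetical, and detecting the separator from the first data line only is an assumption. The `separator` and `skip_header` arguments mirror the configuration keys of the same name.

```python
def detect_separator(line):
    """Guess the column separator of one edge-list line; whitespace is the fallback."""
    for sep in ("\t", ",", ";"):
        if sep in line:
            return sep
    return " "


def parse_edge_list(lines, separator="auto", skip_header=False):
    """Parse 'timestamp node1 node2' rows into (t, u, v) integer triples."""
    data = [ln.strip() for ln in lines if ln.strip()]
    if skip_header:
        data = data[1:]
    if separator == "auto" and data:
        separator = detect_separator(data[0])
    rows = []
    for line in data:
        # str.split() with no argument tolerates runs of spaces.
        fields = line.split() if separator == " " else line.split(separator)
        t, u, v = (int(f) for f in fields[:3])
        rows.append((t, u, v))
    return rows


print(parse_edge_list(["31220 1 42", "31220 5 12", "31240 1 42"]))
print(parse_edge_list(["t,i,j", "0,1,2"], skip_header=True))
```

Any columns beyond the first three (some datasets append metadata per row) are ignored by this sketch.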
This format is compatible with many public network datasets:
- SocioPatterns (LyonSchool, SFHH, InVS, Thiers, Hospital, etc.)
- Socio-economic networks (email-EU, Congress bills)
- Any temporal edge list with the above structure
The toolkit generates a comprehensive analysis package:
output/
├── figures/
│   ├── degree_distribution.png
│   ├── centrality_comparison.png
│   ├── centrality_network.png
│   ├── community_sbm.png
│   ├── sbm_block_matrix.png
│   ├── temporal_activity.png
│   ├── dynamic_sbm_transitions.png
│   ├── dynamic_sbm_evolution.png
│   ├── group_size_distribution.png   (if --hypergraph)
│   ├── group_size_over_time.png      (if --hypergraph)
│   └── network_animation.mp4         (if --animate)
├── metrics.csv
├── top_nodes.csv
├── sbm_results.csv
├── dynamic_sbm_windows.csv
├── dynamic_sbm_transitions.csv
├── dynamic_sbm_stability.csv
├── hypergraph_groups.csv             (if --hypergraph)
├── group_size_distribution.csv       (if --hypergraph)
└── summary.txt
| File | Description |
|---|---|
| metrics.csv | All network-level metrics in machine-readable format |
| top_nodes.csv | Top 20 nodes ranked by each centrality measure |
| sbm_results.csv | Block assignments, sizes, and internal densities |
| dynamic_sbm_windows.csv | Per-window SBM results (blocks, edges, MDL) |
| dynamic_sbm_transitions.csv | Block-to-block transition probability matrix |
| dynamic_sbm_stability.csv | Stability score for each block |
| hypergraph_groups.csv | All extracted groups: window_id, group_id, group_size, node_ids |
| group_size_distribution.csv | Distribution of group sizes: size, count, proportion |
| summary.txt | Human-readable comprehensive report |
To demonstrate the toolkit's capabilities, we analyse the LyonSchool dataset from the SocioPatterns project.
The LyonSchool dataset captures face-to-face interactions between students and teachers in a primary school in Lyon, France, recorded using RFID sensors over two consecutive school days.
| Property | Value |
|---|---|
| Nodes | 242 (students + teachers) |
| Temporal edges | 125,773 |
| Unique static edges | 8,317 |
| Duration | ~2 days (~32 hours) |
| Time resolution | 20 seconds |
# Standard analysis (SBM + Dynamic SBM)
python main.py \
--input data/tij_LyonSchool.dat \
--output output/
# With hypergraph group extraction
python main.py \
--input data/tij_LyonSchool.dat \
--output output/ \
--hypergraph
Below are the complete results obtained from our analysis pipeline.
Reproducibility note: These values are from one run with default settings (window size 300s, step 60s). Results may vary slightly depending on MCMC initialization and parameter choices.
| Metric | Value |
|---|---|
| Nodes | 242 |
| Edges | 8,317 |
| Density | 0.2852 |
| Connected | Yes (single component) |
| Diameter | 3 |
| Average Path Length | 1.732 |
The network is dense (28.5% of possible edges exist) and highly connected — any two individuals can reach each other in at most 3 hops, with an average of less than 2.
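As a quick arithmetic check (a worked calculation, not toolkit output), the density of an undirected simple graph and its mean degree follow directly from the node count n = 242 and edge count m = 8,317 in the table above:

$$
\text{density} = \frac{2m}{n(n-1)} = \frac{2 \times 8317}{242 \times 241} = \frac{16634}{58322} \approx 0.2852,
\qquad
\bar{d} = \frac{2m}{n} = \frac{16634}{242} \approx 68.74
$$

Both values agree with the reported density and with the mean degree in the degree statistics.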
| Statistic | Value |
|---|---|
| Mean degree | 68.74 |
| Std deviation | 26.57 |
| Min degree | 20 |
| Max degree | 134 |
| Median | 68.5 |
The degree distribution is unimodal and approximately symmetric, suggesting relatively homogeneous participation across individuals. The log-log plot shows the network does not follow a power-law distribution, which is typical for social contact networks in closed environments (schools, workplaces).
| Metric | Value |
|---|---|
| Clustering Coefficient | 0.5255 |
| Transitivity | 0.4798 |
The high clustering coefficient (52.5%) indicates strong local cohesion — if person A interacts with B and C, there's a high probability that B and C also interact. This is characteristic of school environments where students form tight-knit groups.
The scatter plot reveals strong correlation between degree and eigenvector centrality, meaning well-connected individuals are connected to other well-connected individuals — a signature of core-periphery structure.
Network visualisation with node sizes proportional to degree centrality. The graph reveals a relatively dense core with some peripheral nodes.
| Centralization Measure | Value | Interpretation |
|---|---|---|
| Degree | 0.2731 | Moderately centralized |
| Betweenness | 0.0103 | Low centralization |
| Closeness | 0.1128 | Slightly centralized |
The network is slightly centralized overall. The low betweenness centralization (~1%) indicates no single node dominates shortest paths, which is expected in a dense network with short diameter. The moderate degree centralization indicates some individuals are more socially active than others.
| Rank | Node ID | Degree Centrality |
|---|---|---|
| 1 | 1551 | 0.556 |
| 2 | 1780 | 0.535 |
| 3 | 1761 | 0.531 |
| 4 | 1673 | 0.514 |
| 5 | 1665 | 0.510 |
| 6 | 1552 | 0.510 |
| 7 | 1579 | 0.506 |
| 8 | 1700 | 0.502 |
| 9 | 1890 | 0.498 |
| 10 | 1765 | 0.490 |
Note: Node IDs (1551, 1780, etc.) are the original participant identifiers from the SocioPatterns dataset, not sequential indices. The 242 unique participants have IDs ranging from 1426 to 1922.
These individuals had contact with more than 50% of the school population during the observation period.
The SBM infers the latent community structure using Bayesian inference with the Minimum Description Length criterion.
| SBM Parameter | Value |
|---|---|
| Optimal number of blocks | 18 |
| Description Length (MDL) | 15,827.51 bits |
| ICL | -11,510.71 |
Note: The toolkit computes both MDL (from graph-tool) and ICL (following course notation). MDL is used for model selection during inference; ICL is reported for comparison with the course material.
When external node attributes are available (e.g., school class labels), the toolkit computes the true attribute assortativity coefficient following the course definition:

$$r = \frac{Q}{Q_{\max}} = \frac{\text{Tr}(e) - \sum_i a_i^2}{1 - \sum_i a_i^2}$$

where:
- $e_{ij}$ = fraction of edges connecting class $i$ to class $j$
- $a_i = \sum_j e_{ij}$ = fraction of edge endpoints in class $i$
- $\text{Tr}(e) = \sum_i e_{ii}$ = fraction of edges within same class
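The quantities Q, Q_max and r = Q/Q_max reported in the table below can be computed from an edge list and a node-to-class mapping. The sketch assumes the usual reading Q = Tr(e) − Σ a_i² and Q_max = 1 − Σ a_i², with each undirected edge split evenly between e_ij and e_ji; the function name is hypothetical.

```python
from collections import Counter


def attribute_assortativity(edges, label_of):
    """Return (Q, Q_max, r) for an undirected edge list and node labels."""
    m = len(edges)
    e = Counter()
    for u, v in edges:
        i, j = label_of[u], label_of[v]
        # Split each edge between (i, j) and (j, i) so that e is symmetric
        # and sums to one; a within-class edge adds 1/m to e[(i, i)].
        e[(i, j)] += 0.5 / m
        e[(j, i)] += 0.5 / m
    a = Counter()
    for (i, _), w in e.items():
        a[i] += w
    trace = sum(w for (i, j), w in e.items() if i == j)
    expected = sum(w * w for w in a.values())
    q, q_max = trace - expected, 1 - expected
    if q_max == 0:
        raise ValueError("assortativity is undefined when all edges share one class")
    return q, q_max, q / q_max


# Toy example: two classes with one within-class edge each and one bridge.
label_of = {1: "A", 2: "A", 3: "B", 4: "B"}
q, q_max, r = attribute_assortativity([(1, 2), (3, 4), (2, 3)], label_of)
print(round(q, 4), round(q_max, 4), round(r, 4))  # 0.1667 0.5 0.3333
```

Passing the inferred SBM block labels instead of class labels yields the partition-based variant of the same quantities.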
| Attribute Assortativity | Value | Description |
|---|---|---|
| Number of classes | 11 | 1A, 1B, 2A, 2B, 3A, 3B, 4A, 4B, 5A, 5B, Teachers |
| Modularity Q | 0.2111 | Same-class edges above random |
| Modularity Q_max | 0.9031 | Maximum possible |
| Assortativity r = Q/Q_max | 0.2338 | Moderate assortative mixing |
Interpretation:
The toolkit also reports partition modularity based on the inferred SBM blocks (not external attributes):
| Partition Metric | Value |
|---|---|
| Partition Modularity Q | 0.1247 |
| Partition Q_max | 0.9306 |
| Partition Assortativity | 0.134 |
Important distinction: Partition modularity measures the quality of the SBM partition, while attribute assortativity measures true homophily based on known node labels. These are different concepts!
| Block | Size | Internal Density |
|---|---|---|
| Block 0 | 8 | 0.857 |
| Block 1 | 11 | 1.000 |
| Block 2 | 11 | 0.946 |
| Block 3 | 15 | 1.000 |
| Block 4 | 13 | 1.000 |
| Block 5 | 16 | 0.983 |
| Block 6 | 23 | 0.968 |
| Block 7 | 16 | 0.950 |
| Block 8 | 11 | 1.000 |
| Block 9 | 16 | 0.992 |
| Block 10 | 16 | 0.992 |
| Block 11 | 20 | 0.805 |
| Block 12 | 8 | 1.000 |
| Block 13 | 11 | 0.982 |
| Block 14 | 8 | 1.000 |
| Block 15 | 17 | 0.993 |
| Block 16 | 6 | 1.000 |
| Block 17 | 16 | 0.983 |
The SBM identifies 18 blocks, which is more than the 11 actual school classes. This finer granularity suggests the model detects sub-groups within classes (e.g., friend clusters, seating arrangements) or distinguishes students with different interaction patterns. The internal density close to 1.0 indicates that students within the same block interact with almost everyone else in their block.
Note on density = 1.0: An internal density of 1.0 means the block is a complete subgraph (clique) — every pair of nodes within the block has an edge. This is expected in school classes where all students had at least one contact during the two-day observation period.
Figure: Connection Probability Matrix Π̂. Each cell π̂_qℓ represents the estimated probability of an edge between a node in block q and a node in block ℓ.
The block connection matrix shows:
- Strong diagonal (assortative structure): students primarily interact within their class
- Off-diagonal connections: inter-class interactions during breaks, lunch, etc.
| Temporal Metric | Value |
|---|---|
| Duration | 1,948.3 minutes (~32.5 hours) |
| Unique timestamps | 3,100 |
| Time windows | 1,039 |
| Average edges/window | 265.1 |
| Peak activity | 554 edges |
The temporal activity shows clear periodic patterns:
- Peaks during school hours (class activities, breaks)
- Valleys during night hours
- Two main activity clusters corresponding to the two school days
The Dynamic SBM analyses how community structure evolves over time by fitting independent SBMs to each time window and aligning labels across consecutive windows. This analysis differs fundamentally from the static SBM described above: whilst the static SBM analyses the entire aggregated network (all interactions over two days, yielding 18 blocks, a finer partition than the 11 school classes), the Dynamic SBM fits a separate model to each temporal window. Since each window captures only a snapshot of activity (39 minutes), fewer nodes are active and consequently fewer blocks are detected per window (typically between 3 and 10, with a configured maximum of 10).
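The windowing step that produces these snapshots can be sketched as follows. The parameters correspond to the `window_size_seconds` and `window_step_seconds` configuration keys; the function name, the half-open interval convention, and the decision to keep empty or partial trailing windows are assumptions rather than the toolkit's documented behaviour.

```python
def sliding_windows(events, window_size, step):
    """Yield (start, edges) for each window [start, start + window_size)."""
    if not events:
        return
    t_min = min(t for t, _, _ in events)
    t_max = max(t for t, _, _ in events)
    start = t_min
    while start <= t_max:
        # Aggregate all contacts in the window into a set of undirected edges.
        edges = {
            tuple(sorted((u, v)))
            for t, u, v in events
            if start <= t < start + window_size
        }
        yield start, edges
        start += step


events = [(0, 1, 2), (0, 2, 3), (20, 1, 3), (70, 1, 2), (130, 3, 4)]
windows = list(sliding_windows(events, window_size=60, step=30))
for start, edges in windows:
    print(start, sorted(edges))
```

Because the step is smaller than the window size, consecutive windows overlap and the same contact can appear in more than one snapshot. This sketch rescans all events for every window, which is fine for illustration but quadratic in the worst case.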
Figure: Block Size Heatmap. Number of nodes in each block (rows) across time windows (columns). Darker cells indicate larger blocks; white cells indicate absent blocks.
Interpretation notes:
- Each row represents a block label (0, 1, 2, ...) as assigned after Hungarian algorithm alignment
- The number of active blocks varies across time windows depending on network activity
- Low activity periods (windows ~17–27, corresponding to night): 1–3 blocks detected
- High activity periods (school hours): up to 10 blocks detected
Caveat on label alignment: The Hungarian algorithm aligns labels locally between consecutive windows $(t, t+1)$, which can cause "label drift" over longer periods. A block that appears as "Block 0" at $t=0$ may be relabeled as "Block 3" by $t=20$ due to accumulated misalignments. The transition matrix and heatmap should be interpreted with this limitation in mind.
Figure: Transition Probability Matrix P̂. Each cell p̂_rs = P(b_{t+1}=s | b_t=r) represents the estimated probability that a node moves from block r at time t to block s at time t+1.
The transition matrix shows block-to-block movement probabilities. Key observations:
- Diagonal dominance: nodes tend to stay in their blocks (stable class membership)
- Some off-diagonal flow: students occasionally interact with other classes
- Block stability ranges from 35% to 80% depending on the class
| Dynamic SBM Metric | Value |
|---|---|
| Windows analysed | 50 |
| Block stability range | 0.35 – 0.80 |
| Nodes that changed blocks | 242 (all) |
All 242 nodes changed blocks at least once, reflecting the natural dynamics of school life — students temporarily join different groups during breaks, lunch, or cross-class activities.
When running with --hypergraph, the toolkit extracts group interactions by identifying maximal cliques in each time window.
python main.py --input data/tij_LyonSchool.dat --output output/ --hypergraph
Figure: Distribution of group sizes extracted via clique enumeration. Left: linear scale; Right: log scale.
| Hypergraph Metric | Value |
|---|---|
| Total groups extracted | 73,950 |
| Most common group size | 3 (triangles) |
| Largest groups observed | 10–15 individuals |
| Windows with groups | 1,028 / 1,039 |
| Analysis time | ~3 seconds |
The group size distribution typically follows a power-law-like decay: many small groups (triads, tetrads) and progressively fewer large groups. This is consistent with the observation that full-class interactions are rare, while small-group conversations are frequent.
Figure: Median group size per time window with interquartile range (IQR). The pattern mirrors overall activity — larger groups form during peak school hours.
Customize the analysis by editing config.yaml:
# Input file parsing
input:
  separator: "auto"              # auto, space, tab, comma
  skip_header: false

# Time window parameters
time_windows:
  window_size_seconds: 300       # 5 minutes
  window_step_seconds: 60        # 1-minute step (consecutive windows overlap by 4 minutes)

# Static SBM parameters
sbm:
  max_blocks: 20
  equilibrate: true              # More accurate but slower

# Dynamic SBM parameters
dynamic_sbm:
  enabled: true
  max_windows: 50                # Number of windows to analyse
  max_blocks: 10                 # Max blocks per window

# Visualisation settings
visualisation:
  dpi: 300
  format: "png"

# Animation settings
animation:
  enabled: false
  max_frames: 100
  fps: 10
  resolution: [1920, 1080]       # HD, use [3840, 2160] for 4K

# Hypergraph group extraction (clique-based)
# See: Iacopini et al. (2022) https://doi.org/10.1038/s42005-022-00845-y
hypergraph:
  enabled: false                 # Enable with --hypergraph flag
  min_group_size: 3              # Minimum clique size (3 = triangles+)
  max_group_size: 20             # Safety limit for large cliques
  max_cliques_per_window: 10000  # Safety limit per window

| Option | Description |
|---|---|
| --input, -i | Path to input temporal edge list (required) |
| --output, -o | Path to output directory (required) |
| --config, -c | Path to configuration YAML file |
| --animate | Generate network animation (slow) |
| --no-dynamic-sbm | Skip dynamic SBM analysis |
| --hypergraph | Enable hypergraph group extraction via cliques |
| --min-group-size | Minimum group/clique size (default: 3) |
| --max-group-size | Maximum group/clique size (default: 20) |
| --max-cliques-per-window | Safety limit for clique enumeration (default: 10000) |
📚 The complete theoretical background has been moved to a separate document for clarity.
➡️ See THEORY.md for the full mathematical foundations, including:
- Basic notation and adjacency matrices
- Global statistics: density, reciprocity, transitivity
- Node centrality and network centralisation (Freeman's index)
- Stochastic Block Model (SBM): latent variables, VEM, ICL
- SBM vs modularity-based methods
- Assortativity coefficient (attribute vs partition)
- Temporal extension (Dynamic SBM)
- Hypergraph group extraction via cliques
All notation follows the Network Data Analysis course (M.F. Marino, University of Florence).
Holland, P. W., & Leinhardt, S. (1976). Local structure in social networks. Sociological Methodology, 7, 1–45.
Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association, 76(373), 33–50. https://doi.org/10.1080/01621459.1981.10477598
Holland, P. W., Laskey, K. B., & Leinhardt, S. (1983). Stochastic blockmodels: First steps. Social Networks, 5(2), 109–137.
Freeman, L. C. (1979). Centrality in social networks: Conceptual clarification. Social Networks, 1(3), 215–239. https://doi.org/10.1016/0378-8733(78)90021-7
Hoff, P. D., Raftery, A. E., & Handcock, M. S. (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460), 1090–1098. https://doi.org/10.1198/016214502388618906
Nowicki, K., & Snijders, T. A. B. (2001). Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96(455), 1077–1087. https://doi.org/10.1198/016214501753208735
Daudin, J. J., Picard, F., & Robin, S. (2008). A mixture model for random graphs. Statistics and Computing, 18(2), 173–183. https://doi.org/10.1007/s11222-007-9060-7
Airoldi, E. M., Blei, D. M., Fienberg, S. E., & Xing, E. P. (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9, 1981–2014. https://jmlr.org/papers/v9/airoldi08a.html
Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81(395), 832–842. https://doi.org/10.1080/01621459.1986.10478342
Matias, C., & Miele, V. (2017). Statistical clustering of temporal dynamic networks. Statistics and Computing, 27(4), 1065–1086.
Iacopini, I., Petri, G., Baronchelli, A., & Barrat, A. (2022). Group interactions modulate critical mass dynamics in social convention. Communications Physics, 5, 64. https://doi.org/10.1038/s42005-022-00845-y
Peixoto, T. P. (2014). The graph-tool Python library. https://graph-tool.skewed.de/
Peixoto, T. P. (2014). Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Physical Review E, 89(1), 012804.
SocioPatterns Collaboration. (n.d.). SocioPatterns. http://www.sociopatterns.org/
Stehlé, J., Voirin, N., Barrat, A., Cattuto, C., Isella, L., Pinton, J.-F., Quaggiotto, M., Van den Broeck, W., Régis, C., Lina, B., & Vanhems, P. (2011). High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE, 6(8), e23176.
Marino, M. F. (2024–2025). Network Data Analysis [Lecture slides]. Master's Degree in Data Science and Statistical Learning (MD2SL), Dipartimento di Statistica, Informatica, Applicazioni (DiSIA), Università degli Studi di Firenze.
We would like to thank Prof. Maria Francesca Marino for the excellent course material. The theoretical foundations presented in her lectures on network data analysis, stochastic block models, and community detection provided the essential framework for this implementation.
We also thank the developers and maintainers of the graph-tool library. Initially, we attempted to use LaNet-vi for network visualisation, but despite extensive efforts, we were unable to get it to work properly (segmentation faults and compatibility issues). Thanks to graph-tool, we were able to complete this project entirely in Python — work that would otherwise have required R and its ecosystem of network analysis packages.
Orso Peruzzi & Giovanni Di Donato
Master's students in Data Science and Statistical Learning (MD2SL)
IMT School for Advanced Studies Lucca & University of Florence, Italy
This project is licensed under the MIT License — see the LICENSE file for details.
For the full Python dependency list, see requirements.txt.
| Library | License | Usage |
|---|---|---|
| graph-tool | LGPL v3 | Network analysis and SBM inference |
| NumPy | BSD | Numerical computations |
| SciPy | BSD | Scientific computing |
| Matplotlib | PSF/BSD | Visualisation |
| pandas | BSD | Data manipulation |
| PyYAML | MIT | Configuration parsing |
| NetworkX | BSD | Clique enumeration (hypergraph) |
These libraries retain their original licenses. Our MIT license applies only to the original code in this repository.
Last updated: January 2026










