avi-starkware/storage-leak-debug
Aerospike Storage Cost Debugger

Samples or fully scans records from an Aerospike namespace (where object types are encoded as key prefixes, not separate sets) and reports the true storage cost per type, including per-record overhead and primary-index memory.

Setup (local analysis tools)

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

This installs dependencies for the local analysis scripts (analyze_storage_report.py, generate_dashboard.py, dashboard.py). The main scan script (storage_leak_debug.py) runs on the pod using the pod's existing Python environment and has no extra dependencies.

Workflow

1. Copy scripts to the pod

kubectl cp storage_leak_debug.py <pod>:/data/storage-leak-debug/storage_leak_debug.py

2. Run on the pod

The script must be run through IPython on the pod (for access to starkware packages):

kubectl exec -it <pod> -- bash
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py [flags]"

Examples:

# Quick overview (10K samples)
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py"

# Larger sample for more accuracy
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py -n 50000 -o /data/storage-leak-debug/"

# Full scan of all records (hours on large DBs)
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --full-scan -o /data/storage-leak-debug/"

# Full scan + dump specific type's keys for deletion/archival
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --full-scan -o /data/storage-leak-debug/ --write-records /data/storage-leak-debug/tx_exec_v2.csv --write-records-filter tx_execution_info_version2"

# Analyze the merkle-facts set
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --merkle-facts"

# Just the console summary, no CSVs
$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --no-csv"

Running in background (survives disconnects and auth expirations):

Full scans take hours. Rather than keeping an interactive session open, launch the script in the background with nohup so it continues even if your kubectl exec session disconnects:

kubectl exec <pod> -- bash -c 'nohup $(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --full-scan -o /data/storage-leak-debug/" > /data/storage-leak-debug/output.log 2>&1 &'

This returns immediately. Check progress and completion with:

# Tail the log
kubectl exec <pod> -- tail -1 /data/storage-leak-debug/output.log

# Check if DONE marker exists (written on successful completion)
kubectl exec <pod> -- ls /data/storage-leak-debug/DONE

If the pod restarts or the script is killed, re-run with --resume to continue from the last checkpoint:

kubectl exec <pod> -- bash -c 'nohup $(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --full-scan --resume -o /data/storage-leak-debug/" > /data/storage-leak-debug/output.log 2>&1 &'

Options:

Flag                         Default  Description
-n, --samples                10000    Number of records to sample
-o, --output-dir             /tmp/    Directory for CSV output
--full-scan                  off      Scan all records instead of sampling; ignores -n
--resume                     off      Resume a previously interrupted scan from the last checkpoint
--record-overhead            64       Per-record on-disk overhead in bytes (digest + metadata + bin headers)
--merkle-facts               off      Analyze the merkle-facts set
--no-csv                     off      Print the summary only; skip CSV generation
--write-records PATH         off      Write each record's key and device size to a CSV
--write-records-filter TYPE  off      Only write records matching this object type

Output files (written to --output-dir):

  • type_summary.csv -- per-type breakdown: count, device size, value size, index cost
  • DONE -- marker file with the full summary, written on successful completion
  • checkpoint.json -- resume state, written after each partition (deleted on completion)

When --write-records is used:

  • <path>.csv -- one row per record: key,size (device size including overhead)

3. Copy results locally

kubectl cp <pod>:/data/storage-leak-debug/type_summary.csv .
kubectl cp <pod>:/data/storage-leak-debug/DONE .          # verify completion

kubectl cp does not compress and can fail on large files (it uses tar internally, which may hit memory or timeout limits). For large files such as --write-records output, compress and stream instead:

# 1. Compress on the pod (use -1 for fast compression, ~3-4x faster than default)
kubectl exec <pod> -- gzip -1 /data/storage-leak-debug/records.csv

# 2. If the .gz is under ~5GB, download directly:
kubectl exec <pod> -c batcher -- cat /data/storage-leak-debug/records.csv.gz > records.csv.gz

# 3. If the .gz is larger (or download truncates), split on the pod first:
kubectl exec <pod> -- bash -c "split -b 2G /data/storage-leak-debug/records.csv.gz /data/storage-leak-debug/records.csv.gz.part-"

# Download each chunk (use cat, not kubectl cp, for reliability):
kubectl exec <pod> -- ls /data/storage-leak-debug/records.csv.gz.part-*
for part in aa ab ac ad ae af ag ah ai aj; do
  kubectl exec <pod> -c batcher -- cat /data/storage-leak-debug/records.csv.gz.part-$part > records.csv.gz.part-$part || break
done

# Reassemble and verify:
cat records.csv.gz.part-* > records.csv.gz
gzip -t records.csv.gz   # integrity check
gunzip records.csv.gz

# Clean up parts on the pod:
kubectl exec <pod> -- rm /data/storage-leak-debug/records.csv.gz.part-*

4. Generate visual report locally

analyze_storage_report.py reads type_summary.csv and produces a PNG with:

  • KPI strip (total device size, value, overhead, index memory, record count)
  • Cost breakdown by type (stacked bars: value / overhead / index)
  • Object count by type

pip install matplotlib

python3 analyze_storage_report.py -i . -o storage_report.png

# Both main + merkle-facts sets
python3 analyze_storage_report.py -i . --both

Understanding the output

Device size vs value size

len(bin_data["value"]) (value size) is what your application wrote. The actual storage cost per record is higher:

device_size = value_size + record_overhead

The default overhead of 64 bytes covers the Aerospike digest (20B), record metadata (~26B), and single-bin header (~12B). Adjust with --record-overhead if your setup differs (check asinfo -v "namespace/<ns>" for actual usage vs object count).

On top of device storage, each record costs ~64 bytes in the primary index (kept in memory). For types with many small records (e.g. tx_request_data at ~16 bytes/value), the index memory cost dominates the value size.
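As an illustration of this cost model (a sketch using the defaults above, not the script's actual code):

```python
# Sketch of the per-record cost model described above, using the
# default --record-overhead (64 B) and the ~64 B primary-index entry.
RECORD_OVERHEAD = 64  # on-disk: digest (20B) + metadata (~26B) + bin header (~12B)
INDEX_BYTES = 64      # primary-index entry, held in memory

def per_record_cost(value_size: int) -> dict:
    device = value_size + RECORD_OVERHEAD
    return {"device": device, "index": INDEX_BYTES}

# For a small type like tx_request_data (~16 bytes of value),
# overhead and index memory dwarf the value itself:
cost = per_record_cost(16)  # device: 80 bytes on disk, plus 64 bytes of index RAM
```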

Sampling

All scans use partition-filter-based iteration (each of the 4096 partitions is scanned independently), which ensures each record is visited exactly once and avoids double-counting replicas. When sampling, each partition returns an equal share of records, giving uniform coverage across the cluster.
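One way the equal per-partition share can be computed (a sketch; the script's actual quota logic may differ):

```python
N_PARTITIONS = 4096  # Aerospike's fixed partition count

def per_partition_quota(total_samples: int) -> list:
    # Split the sample budget evenly across partitions; spread the
    # remainder over the first partitions so the quotas sum exactly.
    base, rem = divmod(total_samples, N_PARTITIONS)
    return [base + (1 if p < rem else 0) for p in range(N_PARTITIONS)]

quotas = per_partition_quota(10_000)  # the default -n budget
```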

Full scan

Scans every record in the namespace. At startup the script queries the cluster for the total object count (summing object counts across all nodes and dividing by the replication factor) and shows progress as a percentage. Memory usage is constant: records are aggregated on the fly, not held in a list.
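The object-count arithmetic can be sketched as (hypothetical helper mirroring the description above):

```python
def expected_unique_records(per_node_counts, replication_factor):
    # Every node reports master + replica objects, so summing across nodes
    # overcounts each record once per replica; dividing by the replication
    # factor recovers the unique-record total used for the progress %.
    return sum(per_node_counts) // replication_factor

# e.g. two nodes each holding 1,000,000 objects at replication factor 2:
expected_unique_records([1_000_000, 1_000_000], 2)  # 1,000,000 unique records
```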

A DONE marker file containing the full summary is written to the output directory on successful completion, so you can verify the script finished even if the pod restarts.

Resuming interrupted scans

Full scans of large databases can take hours. If the pod crashes or the script is interrupted, use --resume to continue from the last checkpoint:

$(find . -name "*ipython_py_binary") -c "%run /data/storage-leak-debug/storage_leak_debug.py --full-scan --resume -o /data/storage-leak-debug/"

The script saves a checkpoint.json file after each partition completes (atomic write via tmp + rename). On resume, it loads the checkpoint, restores all aggregation counters, and continues from the next partition. The records file (if --write-records is used) is opened in append mode.
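The atomic-write pattern looks roughly like this (a sketch; the checkpoint fields are hypothetical):

```python
import json
import os

def save_checkpoint(path, state):
    # Write to a temp file, then rename into place. os.replace is atomic
    # on POSIX, so a crash mid-write can never leave a truncated
    # checkpoint.json behind: the previous checkpoint survives intact.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)

save_checkpoint("/tmp/checkpoint.json", {"next_partition": 1337, "counters": {}})
```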

Safety: --resume without a checkpoint exits immediately rather than starting a fresh scan. This prevents accidentally overwriting an existing records file after a completed scan.

Auto-start on pod boot

Copy run_scan.sh to the PVC and add a lifecycle hook to the container spec:

lifecycle:
  postStart:
    exec:
      command: ["/bin/bash", "/data/storage-leak-debug/run_scan.sh"]

The script checks for a DONE marker and exits immediately if the scan already completed. Otherwise it launches the scan in the background with --resume. Progress is logged to output.log (line-based, no terminal escape codes) so you can monitor with kubectl exec <pod> -- tail /data/storage-leak-debug/output.log.

Writing records for deletion/archival

--write-records dumps each record's full Aerospike key and device size to a CSV. Use --write-records-filter to limit to a specific object type. The key is written as the ASCII string (or hex:-prefixed hex for binary keys) and can be reconstructed for deletion via bytearray(key.encode("ascii")).
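The decode step can be sketched as (assuming the plain-ASCII / hex:-prefix convention described above):

```python
def decode_key(field):
    # CSV keys are plain ASCII, or "hex:"-prefixed hex digits when the
    # original key contained non-printable bytes.
    if field.startswith("hex:"):
        return bytearray(bytes.fromhex(field[len("hex:"):]))
    return bytearray(field.encode("ascii"))

decode_key("tx_execution_info_version2:0x1a")  # plain ASCII key
decode_key("hex:deadbeef")                     # binary key
```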
