Pr0mp7/shadowserver-ingestor

Shadowserver Ingestor

Fetches scan reports from the Shadowserver API and writes them to PostgreSQL.
Designed to pair with IRIS Data Explorer for DFIR-IRIS case correlation.


Features

  • HMAC-SHA256 authenticated API client
  • CSV report download from dl.shadowserver.org/{id}
  • SHA256-based dedup — no duplicate events on re-ingestion
  • Scheduled ingestion every 15 minutes (configurable)
  • Auto-backfill on first start (default: 7 days)
  • Health endpoint for Docker healthchecks
  • Ingestion audit log — tracks every run with event counts and errors
  • Per-report error handling — one failure doesn't stop other reports
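The HMAC-SHA256 authentication listed above can be sketched roughly as follows. This is a minimal illustration, not this project's client code: it assumes the scheme in which the JSON request body carries an `apikey` field and the hex HMAC-SHA256 of the exact body bytes is sent in an `HMAC2` header. The helper name `sign_request` and the `date` parameter are hypothetical, chosen only for the example.

```python
import hashlib
import hmac
import json


def sign_request(params: dict, api_key: str, api_secret: str) -> tuple[bytes, dict[str, str]]:
    """Return (body, headers) for a signed API call; nothing is sent here.

    Assumption: the server verifies an HMAC-SHA256 over the raw body bytes,
    keyed with the API secret and hex-encoded in the `HMAC2` header.
    """
    # The signature must cover the exact bytes that go on the wire,
    # so serialize once and reuse the same bytes for both.
    body = json.dumps({**params, "apikey": api_key}).encode("utf-8")
    signature = hmac.new(api_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return body, {"HMAC2": signature}


# Illustrative values only; real credentials come from SS_API_KEY / SS_API_SECRET.
body, headers = sign_request({"date": "2024-01-01"}, "example-key", "example-secret")
```

The body and headers would then be POSTed to a method under `SS_API_URL`, such as `reports/list`.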

Quick Start

git clone https://github.com/Pr0mp7/shadowserver-ingestor.git
cd shadowserver-ingestor
cp .env.example .env
# Edit .env with your Shadowserver API key/secret and DB credentials
docker compose up -d

Prerequisites

  • PostgreSQL database with a read-write user
  • Shadowserver API key and secret (request access)

The ingestor auto-creates the database schema on first start.

Architecture

Shadowserver API ──HTTPS/HMAC──► shadowserver-ingestor
  reports/list                        │
  dl.shadowserver.org/{id} (CSV)      │ writes every 15min
                                      ▼
                               PostgreSQL (shadowserver_db)
                                      │
                                      │ reads (optional)
                                      ▼
                               IRIS Data Explorer (port 8087)

Configuration

Variable                  Default                                   Description
SS_API_KEY                (required)                                Shadowserver API key
SS_API_SECRET             (required)                                Shadowserver API secret
SS_API_URL                https://transform.shadowserver.org/api2/  API base URL
DB_HOST                   postgres                                  PostgreSQL host
DB_PORT                   5432                                      PostgreSQL port
DB_NAME                   shadowserver_db                           Database name
DB_USER                   shadowserver_ingestor                     Database user (needs read-write)
DB_PASSWORD               (required)                                Database password
INGEST_INTERVAL_MINUTES   15                                        Scheduled ingestion interval
BACKFILL_DAYS             7                                         Days to backfill on first start
REQUEST_DELAY_SECONDS     1.0                                       Delay between API calls
HEALTH_PORT               8088                                      Health endpoint port
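
As an illustration of how these variables and defaults fit together, here is a minimal sketch that reads them from the environment. The `Settings` class and `load_settings` function are hypothetical names; the ingestor's own configuration code may be organized differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    api_secret: str
    api_url: str
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    db_password: str
    ingest_interval_minutes: int
    backfill_days: int
    request_delay_seconds: float
    health_port: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env

    def required(name: str) -> str:
        value = env.get(name)
        if not value:
            raise RuntimeError(f"missing required environment variable: {name}")
        return value

    return Settings(
        api_key=required("SS_API_KEY"),
        api_secret=required("SS_API_SECRET"),
        api_url=env.get("SS_API_URL", "https://transform.shadowserver.org/api2/"),
        db_host=env.get("DB_HOST", "postgres"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_name=env.get("DB_NAME", "shadowserver_db"),
        db_user=env.get("DB_USER", "shadowserver_ingestor"),
        db_password=required("DB_PASSWORD"),
        ingest_interval_minutes=int(env.get("INGEST_INTERVAL_MINUTES", "15")),
        backfill_days=int(env.get("BACKFILL_DAYS", "7")),
        request_delay_seconds=float(env.get("REQUEST_DELAY_SECONDS", "1.0")),
        health_port=int(env.get("HEALTH_PORT", "8088")),
    )
```

Failing fast on the three required variables surfaces a misconfigured `.env` at startup rather than on the first API or database call.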

CLI Modes

# Test API connectivity
docker exec shadowserver-ingestor python -m ingestor.main --ping

# Run one ingestion cycle and exit
docker exec shadowserver-ingestor python -m ingestor.main --once

# Backfill specific number of days and exit
docker exec shadowserver-ingestor python -m ingestor.main --backfill 30
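
For readers curious how the three flags above might be parsed, here is a minimal `argparse` sketch. It is an assumption about the shape of the interface, not the actual contents of `ingestor.main`; in particular, treating the flags as mutually exclusive is a choice made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ingestor.main")
    # Assumption: only one mode may be selected per invocation.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--ping", action="store_true", help="test API connectivity and exit")
    mode.add_argument("--once", action="store_true", help="run one ingestion cycle and exit")
    mode.add_argument("--backfill", type=int, metavar="DAYS",
                      help="backfill the given number of days and exit")
    return parser


args = build_parser().parse_args(["--backfill", "30"])
# With no flag at all, every mode is unset; a service would then fall
# through to its scheduled loop (hypothetical behavior for this sketch).
```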

Database Schema

ss_events — all events from all report types

Hybrid schema: commonly queried fields are stored as indexed columns, and the complete event is kept in a JSONB column.

Column        Type      Description
report_type   TEXT      e.g., scan_ssl, device_id, scan_http
report_date   DATE      Report date
ip            INET      Source/target IP
port          INTEGER   Port number
asn           INTEGER   Autonomous System Number
geo           TEXT      Country code
hostname      TEXT      Hostname
tag           TEXT      Shadowserver tags
severity      TEXT      low, medium, high
raw_data      JSONB     Complete event as received
event_hash    TEXT      SHA256 for dedup

Dedup: UNIQUE(report_type, report_date, event_hash)
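
The dedup constraint can be illustrated with a small, self-contained sketch. Two things here are assumptions: the hash input (canonical JSON of the whole event, with sorted keys) is a guess at one reasonable scheme, and the demo uses an in-memory SQLite table with a reduced column set so it runs without a server. PostgreSQL accepts the same `ON CONFLICT ... DO NOTHING` clause.

```python
import hashlib
import json
import sqlite3


def event_hash(event: dict) -> str:
    """SHA256 over a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE ss_events (
           report_type TEXT,
           report_date TEXT,
           raw_data    TEXT,
           event_hash  TEXT,
           UNIQUE (report_type, report_date, event_hash)
       )"""
)


def insert_event(report_type: str, report_date: str, event: dict) -> bool:
    """Insert one event; return False if it was already present."""
    cur = conn.execute(
        "INSERT INTO ss_events (report_type, report_date, raw_data, event_hash) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (report_type, report_date, event_hash) DO NOTHING",
        (report_type, report_date, json.dumps(event), event_hash(event)),
    )
    return cur.rowcount == 1


event = {"ip": "192.0.2.10", "port": 443, "tag": "ssl"}
first = insert_event("scan_ssl", "2024-01-01", event)   # inserted
second = insert_event("scan_ssl", "2024-01-01", event)  # skipped as a duplicate
```

Because the constraint includes `report_type` and `report_date`, the same payload appearing in a different report or on a different day is stored as a separate event.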

ss_reports — tracks ingested report types per date

Records which reports have been ingested and when, with event counts.

ss_ingestion_log — audit trail

Logs every ingestion run: start/finish timestamps, status, events ingested/skipped, and error messages.
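
To show how per-report error handling and the audit fields relate, here is a minimal sketch of one ingestion run. The names `RunSummary` and `run_cycle`, the callback signature, and the status strings are all hypothetical; the real table may record different values.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


@dataclass
class RunSummary:
    started_at: datetime
    finished_at: Optional[datetime] = None
    status: str = "running"
    events_ingested: int = 0
    events_skipped: int = 0
    errors: list[str] = field(default_factory=list)


def run_cycle(report_ids: Iterable[str],
              ingest_report: Callable[[str], tuple[int, int]]) -> RunSummary:
    """Process each report independently and collect totals for the audit log.

    `ingest_report` is a caller-supplied function returning (ingested, skipped).
    """
    summary = RunSummary(started_at=datetime.now(timezone.utc))
    for report_id in report_ids:
        try:
            ingested, skipped = ingest_report(report_id)
        except Exception as exc:
            # One failing report is recorded and the loop moves on to the next.
            summary.errors.append(f"{report_id}: {exc}")
            continue
        summary.events_ingested += ingested
        summary.events_skipped += skipped
    summary.finished_at = datetime.now(timezone.utc)
    summary.status = "partial" if summary.errors else "success"
    return summary
```

A summary like this maps directly onto one audit row: timestamps, status, the two counters, and the joined error messages.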

Database Setup

CREATE DATABASE shadowserver_db;
CREATE USER shadowserver_ingestor WITH PASSWORD 'your-password';
GRANT ALL ON DATABASE shadowserver_db TO shadowserver_ingestor;

-- Optional: read-only user for IRIS Data Explorer
CREATE USER shadowserver_viewer WITH PASSWORD 'your-password';
-- (SELECT grants are applied automatically by schema.py)

Related

  • iris-data-explorer — interactive case data explorer that reads from this service's database
  • DFIR-IRIS — the incident response platform

License

LGPL-3.0
