SecFlow

Fully Automated Multi-Vector Threat Analysis Pipeline

Drop in any file, URL, IP, domain, or image. SecFlow routes it through specialized analyzers, reasons about findings with AI, and produces a professional security report with YARA rules, SIGMA rules, and exportable PDF — automatically.

Quick Start · Architecture · Report Output · Docs

What is SecFlow?

SecFlow is an open-source automated threat analysis pipeline built for security analysts, SOC teams, and researchers. Instead of manually running disparate tools and correlating results, SecFlow:

Classifies your input using deterministic rules (no AI on pass 1)
Routes it through the right specialized analyzer via HTTP microservices
Reasons about each pass's findings with Groq AI to decide the next step
Repeats — following IOCs, downloading payloads, pivoting across analyzers
Reports — generates a PWNDoc HTML report with YARA detection rules, SIGMA SIEM rules, MITRE ATT&CK mappings, and one-click PDF export

Features

Feature	Detail
AI-Driven Routing	Groq `qwen/qwen3-32b` decides the next analyzer after each pass — no manual configuration
5 Specialized Analyzers	Malware · Steganography · Reconnaissance · Web Vulnerability · Macro/Office
Smart First-Pass	`file` + `python-magic` deterministic rules on pass 1 — AI only called when type is ambiguous
Download-and-Analyze	Follows IOCs — downloads payloads found in raw output and routes them through the right analyzer
YARA Rule Generation	Auto-generates 2–5 deployable YARA rules per analysis, each citing the exact evidence that drove it
SIGMA Rule Generation	Auto-generates 2–4 SIGMA rules for Splunk / Elastic / Sentinel — covering different log sources
MITRE ATT&CK Mapping	Every finding mapped to real TTP IDs with tactic names
Dual Report Formats	HTML report (print-to-PDF in browser) + structured JSON report (feed directly to AI for further analysis)
React Dashboard	Full frontend UI — submit analyses, view live pipeline progress, browse results per analyzer
VirusTotal Integration	Both Malware and Macro analyzers query 70+ AV engines via VT API v3
Configurable Loop Depth	3, 4, or 5 passes — exits early if AI signals no further signals
Standalone Mode	Every analyzer microservice exposes its own REST API — use them independently

Architecture

User Input (file / URL / IP / domain / image)
        │
        ▼
┌────────────────────────────────┐
│   Input Classifier             │  file + python-magic → deterministic rule
│   (Rule-based, pass 1 only)    │  unknown type? → Groq AI fallback
└───────────────┬────────────────┘
                │  first analyzer selected
                ▼
┌────────────────────────────────────────────────────────┐
│              Analyzer Loop  (N = 3 / 4 / 5 passes)    │
│                                                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │  Run Analyzer  (HTTP → Docker microservice)      │  │
│  │  Malware · Steg · Recon · Web · Macro                 │  │
│  └───────────────┬──────────────────────────────────┘  │
│                  │ findings + raw_output                │
│  ┌───────────────▼──────────────────────────────────┐  │
│  │  AI Routing Engine  (Groq qwen/qwen3-32b)        │  │
│  │  IOC extraction → next_tool + target             │  │
│  └───────────────┬──────────────────────────────────┘  │
│                  │                                      │
│          ┌───────┴──────────────────┐                  │
│       next tool                  null                   │
│          │                          │                   │
│          │               Download HTTP payloads         │
│          │               from raw_output → re-analyze  │
│          └──────────────── repeat ────────────────────┘│
└─────────────────┬──────────────────────────────────────┘
                  │
                  ▼
┌────────────────────────────────┐
│  Findings Store                │  All passes · all findings accumulated
└───────────────┬────────────────┘
                │
                ▼
┌────────────────────────────────────────────┐
│  Threat Intelligence Engine                │
│  (Groq llama-3.3-70b-versatile)           │
│  ├─ Threat Summary + MITRE ATT&CK TTPs    │
│  ├─ YARA Detection Rules (2–5 rules)      │
│  └─ SIGMA SIEM Rules (2–4 rules)          │
└───────────────┬────────────────────────────┘
                │
                ▼
┌────────────────────────────────┐
│  PWNDoc HTML Report            │  Groq summary → browser-rendered HTML
│                                │  One-click Export PDF button
└────────────────────────────────┘

Quick Start

Prerequisites

Docker + Docker Compose
API keys for Groq and VirusTotal (free tiers work)

1. Clone the repository

git clone https://github.com/aradhyacp/SecFlow.git
cd SecFlow/backend

2. Configure environment variables

cp .env.example .env

Edit .env with your keys:

# Required
GROQ_API_KEY=your_groq_api_key_here
VIRUSTOTAL_API_KEY=your_vt_api_key_here

# Optional — unlock additional OSINT capabilities
NUMVERIFY_API_KEY=your_numverify_key      # Phone number lookups
THREATFOX_API_KEY=your_threatfox_key      # Higher ThreatFox rate limits
ipAPI_KEY=your_ipapi_key                  # Higher ip-api.com rate limits

# Pipeline control
MAX_PASSES=3                              # 3 | 4 | 5

3. Start all services

docker compose up -d

This starts 6 containers:

Service	Port	Role
`orchestrator`	`5000`	Pipeline controller — main entry point
`malware-analyzer`	`5001`	Ghidra decompilation + VirusTotal
`steg-analyzer`	`5002`	binwalk + zsteg + steghide + ExifTool
`recon-analyzer`	`5003`	ip-api + ThreatFox + OSINT
`web-analyzer`	`5005`	HTTP vuln scanner + header audit
`macro-analyzer`	`5006`	oletools (olevba) + VirusTotal

Note: First start may take several minutes — the Malware Analyzer downloads Ghidra 12.0.1 (~500 MB) and requires a JDK 21 JVM.

4. Run your first analysis

Analyze a file:

curl -X POST http://localhost:5000/api/smart-analyze \
  -F "file=@/path/to/suspicious.exe" \
  -F "passes=3"

Analyze a URL, IP, or domain:

curl -X POST http://localhost:5000/api/smart-analyze \
  -H "Content-Type: application/json" \
  -d '{"target": "192.168.1.100", "passes": 3}'

Response:

{
  "job_id": "a1b2c3d4",
  "findings": [...],
  "report_paths": {
    "json": "/api/report/a1b2c3d4/json",
    "html": "/api/report/a1b2c3d4/html"
  }
}

Open http://localhost:5000/api/report/<job_id>/html in your browser to view the full report and export to PDF.

5. Launch the frontend

cd ../frontend
npm install
npm run dev

Open http://localhost:5173 — the React dashboard lets you submit analyses, watch pipeline progress in real time, and browse results per analyzer.

Analyzers

Malware Analyzer — Port 5001

Analyzes executables and binaries with a three-layer approach:

Ghidra 12.0.1 (via pyghidra) — full decompilation of all functions to C pseudo-code
objdump -d — assembly-level disassembly
VirusTotal API v3 — 70+ AV engine detections, behavioral tags, file reputation

Supported: exe, dll, so, elf, bin, o, out · Max 50 MB · Requires 4 GB RAM (Ghidra JVM)

Steganography Analyzer — Port 5002

Detects hidden data embedded in images using multiple methods:

binwalk — detects and extracts embedded files at binary offsets
foremost — file carving from raw binary streams
zsteg — LSB steganography detection in PNG/BMP
steghide — passphrase-based steg detection in JPEG/BMP
ExifTool — metadata extraction and anomaly detection

Extracts embedded archives and queues them for re-analysis in the next pipeline pass.

Supported: PNG, JPG, BMP, GIF, TIFF, WebP

Reconnaissance Analyzer — Port 5003

Performs threat intelligence and OSINT on network identifiers:

Scan mode (IP / domain):

Module	Source	What it checks
`ipapi`	ip-api.com	Country, ISP, ASN, geolocation
`talos`	Cisco Talos blocklist	IP reputation / blacklist
`tor`	Tor Project exit list	Tor exit node detection
`tranco`	Tranco ranking list	Domain popularity rank
`threatfox`	abuse.ch ThreatFox	Active IOC / malware association

Footprint mode (email / phone / username):

Email — XposedOrNot breach database (breach count, severity, password risk)
Phone — NumVerify carrier + country + line type validation
Username — Sagemode multithreaded profile discovery across social platforms

Web Vulnerability Analyzer — Port 5005

Audits URLs and web endpoints:

Security header analysis (CSP, HSTS, X-Frame-Options, etc.)
Technology fingerprinting (server, frameworks, CMS)
HTTP response analysis and redirect chain following
Basic vulnerability scanning for common misconfigurations

Macro / Office Analyzer — Port 5006

Dissects Office documents for malicious macros:

oletools (olevba) — extracts and decompiles VBA/XLM macros
AutoExec detection — flags macros that auto-run on open/close
IOC extraction — URLs, IPs, file paths embedded in macro code
Obfuscation detection — Base64, Chr() chains, hex encoding
VirusTotal API v3 — file reputation cross-check

Supported: doc, docx, docm, xls, xlsx, xlsm, xlsb, ppt, pptx, pptm, rtf

Report Output

Every pipeline run produces two report formats saved to backend/reports/<job_id>/:

HTML Report (`report.html`)

Open in any browser. Click Export PDF to print — no server-side PDF rendering needed, no dependencies.

Contains: executive summary · YARA rules · SIGMA rules · MITRE TTPs · per-pass evidence panels · VirusTotal engine badges.

JSON Report (`report.json`)

Fully structured machine-readable output. Use this when you want to:

Feed findings directly into another AI model for deeper analysis
Ingest into a SIEM or ticketing system
Diff two reports programmatically
Build custom dashboards

The JSON mirrors the HTML exactly — every finding, YARA rule, SIGMA rule, IOC, and TTP is present in a clean, typed schema.

See examples/ for sample input files and example_reports for real report outputs generated during development.

Executive Summary

AI-written narrative (Groq qwen/qwen3-32b) covering:

Identified threat name and actor type classification
Attack chain reconstruction (step-by-step)
Confidence rating and overall risk score

YARA Detection Rules

2–5 production-ready YARA rules generated by llama-3.3-70b-versatile, each:

Named with SecFlow_[ThreatCategory]_[IndicatorType] convention
Containing valid YARA 4.x syntax — ready to import into any YARA-compatible scanner
Including a reasoning field citing the exact evidence from the analysis that informed the rule
Covering distinct aspects: file signatures, embedded strings, C2 indicators, packer signatures, memory patterns

rule SecFlow_Trojan_C2StringIndicator {
  meta:
    description = "Detects C2 callback string found in Ghidra decompilation"
    author      = "SecFlow AI"
    severity    = "high"
  strings:
    $c2 = "evil.sh/drop.exe"
    $ua = "Mozilla/4.0 (compatible; MSIE 6.0)"
  condition:
    any of them
}

SIGMA SIEM Rules

2–4 SIGMA rules for immediate SIEM deployment, each:

Covering a different log source (process creation, network, DNS, file events, registry)
Including valid SIGMA syntax compatible with sigma-cli 0.x and pySigma
Mapping tags to real MITRE ATT&CK tactics and technique IDs
Importable into Splunk, Elastic Security, Microsoft Sentinel, Chronicle, QRadar

title: Detect Suspicious AutoExec Macro Execution
id: f2a3b1c4-...
status: experimental
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    CommandLine|contains:
      - 'EXCEL.EXE'
      - '/automation'
  condition: selection
level: high
tags:
  - attack.execution
  - attack.t1137.001

MITRE ATT&CK TTPs

Every identified behavior mapped to real technique IDs with tactic names and reasoning.

Per-Pass Evidence

Collapsible panels for each analyzer pass showing:

Ghidra decompilation output (dark code block, collapsible)
objdump disassembly (collapsible)
VirusTotal engine detections (color-coded severity badges)
Raw analyzer findings JSON

Export PDF

One-click browser print dialog pre-configured for PDF export — no server-side PDF generation needed.

Example Pipeline Runs

Sample input files are in examples/ — includes real malware samples (RealMalware.exe, ColorBug.exe, EarlyEnd.exe, .out ELF binaries) and a malicious Office document (nuclear_motor_example.docm). Corresponding report outputs are in backend/reports/.

Malicious Office Document

Input:  invoice.xlsm
Passes: 3

Pass 1 ─ Rule: .xlsm extension → Macro Analyzer
          olevba: AutoExec macro found
          IOC: http://evil.sh/drop.exe
          VT: 12/70 engines flagged

Pass 2 ─ AI: URL found in IOCs → Web Analyzer
          http://evil.sh/drop.exe — alive, 302 redirect to CDN

Pass 3 ─ AI: no further tool, but HTTP URL in raw_output
          Download: drop.exe → Malware Analyzer
          Ghidra: C2 callback string, packed PE
          VT: 45/70 detections — Trojan.GenericKDZ

Report  ─ PWNDoc HTML generated
          YARA: 4 rules (string, byte sig, packer, C2 domain)
          SIGMA: 3 rules (process_creation, network, registry)
          MITRE: T1566.001, T1059.005, T1071.001

Suspicious Image with Embedded Payload

Input:  profile.png
Passes: 3

Pass 1 ─ Rule: image/png → Steg Analyzer
          binwalk: embedded ELF binary at offset 0x8200
          Archive extracted → queued for re-analysis

Pass 2 ─ Queue: extracted ELF → Malware Analyzer
          Ghidra: C2 callout to 192.168.1.100
          objdump: packed UPX section

Pass 3 ─ AI: IP found → Recon Analyzer
          Talos: blacklisted
          Tor: confirmed exit node
          ThreatFox: associated with AsyncRAT

Report  ─ Full chain documented
          YARA: 3 rules (ELF magic, UPX sig, C2 string)
          SIGMA: 2 rules (network_connection, dns_query)

Suspicious Domain

Input:  malicious-domain.ru
Passes: 3

Pass 1 ─ Rule: domain regex → Recon Analyzer
          ipapi: RU, ISP: HostMaster LLC
          Talos: on blocklist
          ThreatFox: linked to Raccoon Stealer, confidence 95

Pass 2 ─ AI: ThreatFox hit → Web Analyzer
          /login endpoint returns 200, harvesting form detected

Pass 3 ─ AI: no futher signals — loop exits early

Report  ─ Executive summary + TTPs + SIGMA network rules

Project Structure

SecFlow/
├── backend/
│   ├── compose.yml                 # All 6 services on secflow-net
│   ├── .env.example                # All required + optional API keys
│   │
│   ├── orchestrator/               # Pipeline controller (port 5000)
│   │   ├── app/
│   │   │   ├── routes.py           # POST /api/smart-analyze
│   │   │   ├── orchestrator.py     # Pipeline loop + download-and-analyze
│   │   │   ├── classifier/
│   │   │   │   ├── classifier.py   # file + python-magic type detection
│   │   │   │   └── rules.py        # Deterministic routing rules
│   │   │   ├── ai/
│   │           ├── engine.py       # Groq qwen/qwen3-32b routing decisions
│   │           ├── threat_intel.py # YARA rules + SIGMA rules + threat summary
│   │   │   │   └── keywords.txt    # Grep fallback keyword list
│   │   │   ├── adapters/           # Translate analyzer responses → contract
│   │   │   │   ├── malware_adapter.py
│   │   │   │   ├── steg_adapter.py
│   │   │   │   ├── recon_adapter.py
│   │   │   │   ├── web_adapter.py
│   │   │   │   └── macro_adapter.py
│   │   │   ├── store/
│   │   │   │   └── findings_store.py   # Thread-safe findings accumulator
│   │   │   └── reporter/
│   │   │       └── report_generator.py # PWNDoc HTML + Export PDF
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   │
│   ├── Malware-Analyzer/           # Ghidra + objdump + VirusTotal (port 5001)
│   ├── Steg-Analyzer/              # binwalk + zsteg + steghide (port 5002)
│   ├── Recon-Analyzer/             # ip-api + ThreatFox + OSINT (port 5003)
│   ├── Web-Analyzer/               # HTTP vuln scanner (port 5005)
│   └── macro-analyzer/             # oletools + VirusTotal (port 5006)
│
├── frontend/                       # React + Vite dashboard (port 5173)
│   └── src/
│       ├── pages/dashboard/        # Per-analyzer pages + smart pipeline UI
│       ├── components/             # Reusable UI components
│       └── pages/LandingPage.jsx   # Public landing page
│
├── examples/                       # Sample input files for testing
│   ├── RealMalware.exe             # Real malware sample
│   ├── ColorBug.exe / EarlyEnd.exe # PE test samples
│   ├── sample.out / sample2.out    # ELF binaries
│   └── nuclear_motor_example.docm  # Malicious Office document
│
├── docs/                           # Architecture + pipeline + analyzer docs
├── AGENTS.md                       # Agent architecture + coding conventions
└── Readme.md

AI Models

SecFlow uses Groq for all AI inference — free tier, no credit card required.

Role	Model	Why
Pipeline routing	`qwen/qwen3-32b`	Reliable structured JSON output; `/no_think` mode skips chain-of-thought for fast routing decisions
Threat intelligence	`llama-3.3-70b-versatile`	Stronger reasoning for YARA/SIGMA generation and MITRE TTP mapping
Report summary	`qwen/qwen3-32b`	Executive summary + recommendations

SecFlow uses the OpenAI-compatible API spec via the standard openai Python SDK — no vendor-specific SDK required. This means you can swap in any OpenAI-compatible model provider (OpenAI, Groq, Together, Ollama, etc.) by changing just the base_url and model name:

from openai import OpenAI

# Groq (current — free tier)
client = OpenAI(api_key=GROQ_API_KEY, base_url="https://api.groq.com/openai/v1")

# OpenAI (drop-in swap)
client = OpenAI(api_key=OPENAI_API_KEY)  # base_url defaults to api.openai.com

# Local Ollama (fully offline)
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")

Why Groq + free tier? SecFlow was built to be accessible — no paid API required to run the full pipeline. Groq's free tier covers all routing and report generation with no cost. If you're running heavier workloads or want to sponsor the project, see the GitHub Sponsors page.

API Reference

All requests go to the orchestrator at http://localhost:5000.

`POST /api/smart-analyze`

Submit a file or target for analysis.

File input:

curl -X POST http://localhost:5000/api/smart-analyze \
  -F "file=@sample.exe" \
  -F "passes=4"

Target input (URL / IP / domain):

curl -X POST http://localhost:5000/api/smart-analyze \
  -H "Content-Type: application/json" \
  -d '{"target": "https://suspicious-site.com", "passes": 5}'

`GET /api/report/<job_id>/html`

Returns the full PWNDoc HTML report — open in browser, click Export PDF to save.

`GET /api/report/<job_id>/json`

Returns the raw findings JSON for programmatic consumption.

`GET /api/health`

Health check — returns {"status": "healthy"}.

Environment Variables

Variable	Service	Required	Description
`GROQ_API_KEY`	orchestrator	✅	AI routing + threat intel + report generation
`VIRUSTOTAL_API_KEY`	malware, macro	✅	VirusTotal API v3 file/URL analysis
`NUMVERIFY_API_KEY`	recon	Optional	Phone number validation (NumVerify)
`THREATFOX_API_KEY`	recon	Optional	Higher rate limit on ThreatFox IOC queries
`ipAPI_KEY`	recon	Optional	Higher rate limit on ip-api.com
`MAX_PASSES`	orchestrator	Optional	Loop depth — `3` (default) / `4` / `5`

Status

Component	Status
Orchestrator + Classifier + AI Engine	✅ Complete
Malware Analyzer (Ghidra + VirusTotal)	✅ Complete
Steg Analyzer (binwalk + zsteg + steghide)	✅ Complete
Recon Analyzer (ip-api + ThreatFox + OSINT)	✅ Complete
Web Vulnerability Analyzer	✅ Complete
Macro Analyzer (oletools + VirusTotal)	✅ Complete
Download-and-Analyze payload fallback	✅ Complete
YARA Rule Auto-Generation	✅ Complete
SIGMA Rule Auto-Generation	✅ Complete
MITRE ATT&CK TTP Mapping	✅ Complete
HTML Report + JSON Report + Export PDF	✅ Complete
React Frontend Dashboard	✅ Complete

Documentation

Document	Description
AGENTS.md	Agent architecture, service contracts, and AI coding instructions
ProjectDetails.md	Full project specification and design decisions
docs/architecture.md	System component and data-flow diagrams
docs/pipeline-flow.md	Detailed pipeline loop logic and decision tree
docs/analyzers.md	Per-analyzer capability and interface spec
docs/migration.md	Integration guide for the analyzer microservices
backend/Readme.md	Backend setup, development, and troubleshooting guide

Contributing

Contributions are welcome. SecFlow is open source and actively maintained.

Fork aradhyacp/SecFlow
Create a feature branch: git checkout -b feat/your-feature
Make changes — see AGENTS.md for architecture conventions and service contracts
Open a pull request

Good first issues: New IOC extraction patterns, SIGMA rule improvements, additional OSINT modules, frontend analyzer pages, report export improvements.

If SecFlow is useful to your work or research, consider sponsoring the project — it helps keep the free-tier infrastructure and development going.

License

MIT License — see LICENSE for details.

Built for security analysts who need answers, not more tools to juggle.

If SecFlow helps you, give it a star — it helps others discover the project.

github.com/aradhyacp/SecFlow

#cybersecurity #threatintelligence #malwareanalysis #yara #sigma #soc #dfir #infosec #osint #reverseengineering #steganography #virustotal #ghidra #docker #python #openSource #automation #mitre #attackframework #secops #blueTeam #incidentResponse #siem #edr #ioc #pwndoc #groq #llm #aiSecurity

Name		Name	Last commit message	Last commit date
Latest commit History 181 Commits
backend		backend
docs		docs
example_reports		example_reports
examples		examples
frontend		frontend
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
ProjectDetails.md		ProjectDetails.md
Readme.md		Readme.md
SecFlowSocialCard.jpg		SecFlowSocialCard.jpg

Folders and files

Latest commit

History

Repository files navigation

SecFlow

Fully Automated Multi-Vector Threat Analysis Pipeline

What is SecFlow?

Features

Architecture

Quick Start

Prerequisites

1. Clone the repository

2. Configure environment variables

3. Start all services

4. Run your first analysis

5. Launch the frontend

Analyzers

Malware Analyzer — Port 5001

Steganography Analyzer — Port 5002

Reconnaissance Analyzer — Port 5003

Web Vulnerability Analyzer — Port 5005

Macro / Office Analyzer — Port 5006

Report Output

HTML Report (report.html)

JSON Report (report.json)

Executive Summary

YARA Detection Rules

SIGMA SIEM Rules

MITRE ATT&CK TTPs

Per-Pass Evidence

Export PDF

Example Pipeline Runs

Malicious Office Document

Suspicious Image with Embedded Payload

Suspicious Domain

Project Structure

AI Models

API Reference

POST /api/smart-analyze

GET /api/report/<job_id>/html

GET /api/report/<job_id>/json

GET /api/health

Environment Variables

Status

Documentation

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

HTML Report (`report.html`)

JSON Report (`report.json`)

`POST /api/smart-analyze`

`GET /api/report/<job_id>/html`

`GET /api/report/<job_id>/json`

`GET /api/health`

Packages