GH0STB1T:
A MULTI-FORMAT STEGANOGRAPHY TOOLKIT


Why?

Architectural Modernization

This implementation represents a complete architectural migration from platform-dependent Java and .NET codebases to a unified Python solution, delivering:

  • Platform Independence: Eliminates platform-specific runtime dependencies, ensuring portability across heterogeneous computing environments (Windows, macOS, and Linux).

  • Memory Efficiency: Eliminates JVM heap overhead, reducing baseline memory consumption and enabling efficient operation on resource-constrained systems while maintaining full functionality.

  • Auditability: Eliminates reliance on platform-specific cryptographic APIs or closed-source runtime components in favor of open-source Python cryptography, enabling independent security audits and transparent verification of the implementation.

  • Type Safety: This implementation targets Python 3.13+ to leverage modern type annotations and static analysis, so that type errors across the codebase can be caught before runtime.

These improvements reduce deployment complexity and computational overhead, facilitating reliable and efficient operation in resource-constrained environments for both human operators and automated LLM-driven workflows.

(see also EFF Coders' Rights Project Reverse Engineering FAQ)

Security Upgrade

Longstanding steganography tools (e.g., OpenStego, DeepSound, SilentEye) use outdated cryptographic primitives that leave hidden data vulnerable to attack. These tools rely on weak key derivation functions (KDFs), such as hashing the password directly with MD5, SHA-1, or SHA-256, which provide little resistance to brute-force attacks. They also rely on legacy encryption modes such as AES-CBC without authentication. These modes provide confidentiality only, leaving payloads vulnerable to undetected modification, bit-flipping attacks, and, under common error-handling patterns, padding oracle attacks.

This implementation pairs existing steganography protocols with modern, audited cryptographic standards to ensure secure information hiding:

| Component | Algorithm | Parameters | Security Properties |
|---|---|---|---|
| Key Derivation | Argon2id | 64 MB memory, 3 iterations, parallelism=4 | Memory-hard function; hybrid protection against side-channel attacks |
| Encryption | AES-256-GCM | 96-bit random nonce, 128-bit auth tag | Authenticated Encryption with Associated Data (AEAD); confidentiality + integrity + authenticity in a single operation |
| Salt | Random | 128-bit, unique per file | Protection against rainbow table attacks; unique keys even with identical passwords |

What Does This Mean?

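Standards-Based Cryptography

  • Transition to standardized, publicly reviewed algorithms: AES-256-GCM is a NIST-approved authenticated encryption mode (SP 800-38D), and Argon2id is specified in RFC 9106 (it is not itself a NIST/FIPS-approved KDF). The same cryptographic primitives are used in TLS 1.3, Signal, Bitwarden, and enterprise security systems.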

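Brute-Force Resistance

  • Legacy SHA-256 hashing allows attackers to test billions of passwords per second on modern GPUs. Argon2id is memory-hard by design: with the parameters above, every guess costs 64 MB of memory and three passes, which slows attackers to roughly thousands of tests per second and makes large-scale brute-force attacks far more expensive. Weak passwords (e.g., 8 characters) gain substantially more protection than under direct hashing, although a long, unique passphrase is still the better choice.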
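The cost gap between a fast hash and a memory-hard KDF can be measured directly. The sketch below is only an illustration: it uses the standard library's `hashlib.scrypt` as a stand-in for Argon2id (which is not in the standard library), and the scrypt parameters and helper names are assumptions chosen for the demo, not GH0STB1T's actual settings.

```python
import hashlib
import secrets
import time


def fast_hash_key(password: str) -> bytes:
    """Legacy-style derivation: a single unsalted SHA-256 pass."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def memory_hard_key(password: str, salt: bytes) -> bytes:
    """Stand-in for Argon2id: scrypt needing about 16 MiB of memory per guess."""
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )


def time_call(func, *args) -> float:
    """Return the wall-clock seconds taken by one call."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    salt_a = secrets.token_bytes(16)  # 128-bit random salt
    salt_b = secrets.token_bytes(16)
    # Same password, different salts: the derived keys differ.
    print(memory_hard_key("hunter2", salt_a) != memory_hard_key("hunter2", salt_b))
    print(f"SHA-256: {time_call(fast_hash_key, 'hunter2'):.6f} s per guess")
    print(f"scrypt:  {time_call(memory_hard_key, 'hunter2', salt_a):.6f} s per guess")
```

The absolute timings depend on the machine; the point is the ratio between the two, which an attacker pays on every guess.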

Tamper Detection & Integrity Verification

  • Legacy AES-CBC allows for undetected tampering, bit-flipping attacks, and payload manipulation. AES-GCM cryptographically authenticates every byte of hidden data. Any modification (e.g., a single bit flip) causes decryption to fail. Altering data without detection becomes computationally infeasible.
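The tamper-detection behaviour can be illustrated with an authentication tag alone. This is a minimal sketch using HMAC-SHA256 from the standard library in place of the GCM tag: it shows integrity checking only (no encryption), and the `seal` and `open_sealed` helpers are hypothetical names, not part of GH0STB1T.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 produces a 32-byte tag


def seal(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def open_sealed(key: bytes, sealed: bytes) -> bytes:
    """Return the payload if the tag verifies; raise ValueError otherwise."""
    payload, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: data was modified")
    return payload


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    sealed = seal(key, b"hidden payload")
    tampered = bytes([sealed[0] ^ 0x01]) + sealed[1:]  # flip a single bit
    print(open_sealed(key, sealed))
    try:
        open_sealed(key, tampered)
    except ValueError as exc:
        print(exc)
```

AES-GCM performs the equivalent check as part of decryption, so a caller never sees plaintext from a modified payload.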

Elimination of Padding Oracle Vulnerabilities

  • Legacy AES-CBC with PKCS#7 padding is vulnerable to adaptive chosen-ciphertext attacks. Attackers can decrypt data without the password by observing error messages. AES-GCM uses authenticated encryption: no padding, no oracle, constant-time failure.


Features

Multimedia Steganography: Multi-format support across audio, images, and video
• Audio: WAV / MP3 / FLAC / M4A / AIFF
• Image: BMP / PNG / JPEG / WEBP / TIFF / SVG / GIF
• Video: MP4 / MKV / MOV / AVI

Strong Encryption: AES-GCM with Argon2id key derivation for embedded files (see Security Upgrade)

CLI: Easy-to-use command line interface (see CLI)

API: Project integration via API (see API)

Docker: Containerized deployment support (see Docker)

LLM Integration: Built-in skills system for LLM-driven workflows (see LLM Integration)

Cross-Platform Compatibility: macOS, Linux, Windows

Installation

Requirements

  • Python 3.13+
  • FFmpeg (for audio format conversion)
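Both prerequisites can be checked before installing. The following is a minimal sketch using only the standard library; `check_requirements` is a hypothetical helper, and it assumes the FFmpeg executable is named `ffmpeg` and is expected on `PATH`.

```python
import shutil
import sys


def check_requirements(min_python=(3, 13)):
    """Report whether the interpreter and FFmpeg prerequisites look satisfied."""
    return {
        # Compare only (major, minor) against the documented minimum.
        "python_ok": sys.version_info[:2] >= min_python,
        # shutil.which returns None when no `ffmpeg` executable is on PATH.
        "ffmpeg_ok": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```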

GitHub Release

Install the latest release directly from GitHub (prebuilt .whl files are also available from Releases):

pip install git+https://github.com/kariemoorman/ghostbit.git@latest

Development Build

Install from source for development or to access the latest features:

git clone https://github.com/kariemoorman/ghostbit.git
cd ghostbit
pip install -e ".[dev]"

Usage

💻 CLI

GH0STB1T CLI provides quick encoding/decoding/analysis operations directly from the terminal.

Encode (Hide files)
# Audio
ghostbit audio encode -i <audio_filepath> -s <secret_filepath> <secret_filepath> -q {low,normal,high} -o <output_filename>.<desired_format> -p

# Image
ghostbit image encode -i <image_filepath> -s <secret_filepath> <secret_filepath> -p

Calculate Carrier Capacity
# Audio
ghostbit audio capacity -i <audio_filepath> -q {low,normal,high}

# Image
ghostbit image capacity -i <image_filepath>

Decode (Extract Files)
# Audio
ghostbit audio decode -i <audio_filepath> -p

# Image
ghostbit image decode -i <image_filepath> -p

Analyze File
# Audio
ghostbit audio analyze -i <audio_filepath>

# Image
ghostbit image analyze -i <image_filepath>

Create Test Files
# Audio Creation for Testing
ghostbit audio test -o test_audio

# Image Creation for Testing
ghostbit image test -o test_images
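When the CLI is driven from a script, it is safer to assemble the argument list programmatically than to concatenate strings. The sketch below builds an argv list for `ghostbit audio encode` from the flags shown above (`-i`, `-s`, `-q`, `-o`, `-p`). The helper name is hypothetical, and treating `-p` as a bare flag that triggers a password prompt is an assumption based on how it appears in the commands.

```python
import shlex


def build_audio_encode_command(carrier, secret_files, output=None, quality=None,
                               prompt_password=True):
    """Build an argv list for `ghostbit audio encode`."""
    if not secret_files:
        raise ValueError("at least one secret file is required")
    argv = ["ghostbit", "audio", "encode", "-i", str(carrier), "-s"]
    argv += [str(path) for path in secret_files]
    if quality is not None:
        if quality not in {"low", "normal", "high"}:
            raise ValueError(f"unknown quality mode: {quality}")
        argv += ["-q", quality]
    if output is not None:
        argv += ["-o", str(output)]
    if prompt_password:
        argv.append("-p")  # assumed to be a bare flag with no argument
    return argv


if __name__ == "__main__":
    argv = build_audio_encode_command(
        "music.wav", ["document.pdf", "image.jpg"], output="output.flac", quality="high"
    )
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
    # To execute instead of print: subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.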

Python API

GH0STB1T provides a Python API for seamless integration into existing applications and workflows.

Encode (Hide files)
from ghostbit.audiostego import AudioMultiFormatCoder, EncodeMode

# Initialize coder
coder = AudioMultiFormatCoder()

# Encode files
coder.encode_files_multi_format(
    carrier_file="music.wav",
    secret_files=["document.pdf", "image.jpg"],
    output_file="output.wav",
    quality_mode=EncodeMode.NORMAL_QUALITY,
    password="optional_password"
)

# Encoding with Progress Callbacks
from ghostbit.audiostego import AudioMultiFormatCoder

coder = AudioMultiFormatCoder()

# Encoding progress
def on_encode_progress():
    print(".", end="", flush=True)

coder.on_encoded_element = on_encode_progress

coder.encode_files_multi_format(
    carrier_file="carrier.wav",
    secret_files=["secret.pdf"],
    output_file="output.wav"
)

Calculate Carrier Capacity
from ghostbit.audiostego import AudioMultiFormatCoder, BaseFileInfoItem, EncodeMode

coder = AudioMultiFormatCoder()
wav_file = coder._convert_to_wav("carrier_file.flac")

def get_capacity(wav_file, encode_mode):

    base_file = BaseFileInfoItem(
        full_path=wav_file,
        encode_mode=encode_mode,
        wav_head_length=44,
    )
    return base_file.max_inner_files_size

capacity_bytes = get_capacity(wav_file, EncodeMode.NORMAL_QUALITY)

print(f"Maximum capacity: {capacity_bytes / (1024*1024):.2f} MB")
# Compare Capacity Across Quality Modes
import os

from ghostbit.audiostego import BaseFileInfoItem, EncodeMode


def get_capacity(wav_file, encode_mode):
    # Same helper as in the previous example, repeated so this snippet runs on its own
    base_file = BaseFileInfoItem(
        full_path=wav_file,
        encode_mode=encode_mode,
        wav_head_length=44,
    )
    return base_file.max_inner_files_size


# Check capacity with different quality modes
carrier = "long_audio.wav"
secret_file = "large_video.mp4"
secret_size = os.path.getsize(secret_file) / (1024 * 1024)

print(f"Secret file size: {secret_size:.2f} MB")

for mode in [EncodeMode.LOW_QUALITY, EncodeMode.NORMAL_QUALITY, EncodeMode.HIGH_QUALITY]:
    capacity = get_capacity(carrier, mode) / (1024 * 1024)
    fits = "✅ FITS" if capacity >= secret_size else "❌ TOO LARGE"
    print(f"{mode.name}: {capacity:.2f} MB capacity - {fits}")
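When several files are hidden at once, the comparison has to use their combined size. The helper below is a sketch of that check and does not import GH0STB1T: it takes capacities already computed per mode (for example with `max_inner_files_size`) as a plain mapping. The `overhead_per_file` parameter is a hypothetical safety margin, since the per-file overhead of the container format is not documented here.

```python
import os


def total_size_bytes(paths, overhead_per_file=0):
    """Sum the on-disk sizes of the files, plus an optional per-file margin."""
    return sum(os.path.getsize(path) + overhead_per_file for path in paths)


def modes_that_fit(capacities, secret_paths, overhead_per_file=0):
    """Return the mode names whose capacity (in bytes) covers all secret files.

    `capacities` maps a mode name to its capacity in bytes.
    """
    needed = total_size_bytes(secret_paths, overhead_per_file)
    return [mode for mode, capacity in capacities.items() if capacity >= needed]
```

For example, `modes_that_fit({"NORMAL_QUALITY": 5_000_000}, ["report.pdf", "notes.txt"])` returns `["NORMAL_QUALITY"]` only if both files together fit in 5,000,000 bytes.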
Decode (Extract Files)
from ghostbit.audiostego import AudioMultiFormatCoder

# Initialize coder
coder = AudioMultiFormatCoder()

# Decode files
coder.decode_files_multi_format(
    encoded_file="output.wav",
    output_dir="extracted/",
    password="optional_password"
)

# Decode with Progress Callbacks
from ghostbit.audiostego import AudioMultiFormatCoder

coder = AudioMultiFormatCoder()

# Decoding progress
def on_decode_progress():
    print(".", end="", flush=True)

coder.on_decoded_element = on_decode_progress

coder.decode_files_multi_format(
    encoded_file="output.wav",
    output_dir="extracted/"
)
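Printing a dot per element can flood the terminal on large carriers. A callable object that counts elements and reports only periodically is one alternative; the sketch below assumes, as the examples above suggest, that the callbacks are invoked with no arguments. `ProgressCounter` is a hypothetical helper, not part of GH0STB1T.

```python
class ProgressCounter:
    """Zero-argument callable that counts processed elements."""

    def __init__(self, report_every=1000):
        self.count = 0
        self.report_every = report_every

    def __call__(self):
        self.count += 1
        # Overwrite the same terminal line once every `report_every` elements.
        if self.count % self.report_every == 0:
            print(f"\rprocessed {self.count} elements", end="", flush=True)


# Intended use (requires ghostbit):
#   counter = ProgressCounter()
#   coder.on_decoded_element = counter
#   coder.decode_files_multi_format(encoded_file="output.wav", output_dir="extracted/")
#   print(f"\ndone after {counter.count} elements")
```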
Password Protection
from ghostbit.audiostego import AudioMultiFormatCoder, KeyRequiredEventArgs

coder = AudioMultiFormatCoder()

# Handle password requests during decoding
def request_password(args: KeyRequiredEventArgs):
    password = input(f"Enter password (version {args.h22_version}): ")
    if password:
        args.key = password
    else:
        args.cancel = True  # Cancel operation

coder.on_key_required = request_password

coder.decode_files_multi_format(
    encoded_file="encrypted_output.wav",
    output_dir="extracted/"
)

# Password-Protected Multiple Files
from ghostbit.audiostego import AudioMultiFormatCoder, EncodeMode

coder = AudioMultiFormatCoder()

# Encode with password
coder.encode_files_multi_format(
    carrier_file="music.mp3",
    secret_files=[
        "report.pdf",
        "spreadsheet.xlsx",
        "presentation.pptx"
    ],
    output_file="encoded_music.mp3",
    quality_mode=EncodeMode.HIGH_QUALITY,
    password="SuperSecure123!"
)

print("✅ Multiple files encrypted and hidden!")

# Decode
coder.decode_files_multi_format(
    encoded_file="encoded_music.mp3",
    output_dir="extracted_files/",
    password="SuperSecure123!"
)

print("✅ Files extracted successfully!")
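After a round trip it can be worth confirming that the extracted files are byte-identical to the originals. The following sketch compares SHA-256 digests using only the standard library. It assumes extracted files keep their original file names inside the output directory, which is not confirmed by the examples above; `verify_extraction` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_extraction(originals, extracted_dir):
    """Map each original file name to True if a byte-identical copy was extracted."""
    results = {}
    for original in map(Path, originals):
        candidate = Path(extracted_dir) / original.name
        results[original.name] = (
            candidate.is_file() and sha256_of(candidate) == sha256_of(original)
        )
    return results
```

For the example above, `verify_extraction(["report.pdf", "spreadsheet.xlsx", "presentation.pptx"], "extracted_files/")` would return one boolean per file.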

🐳 Docker

GH0STB1T can be deployed using Docker for isolated, reproducible environments.

Initial Setup

  1. Clone the repository:

git clone https://github.com/kariemoorman/ghostbit.git
cd ghostbit

  2. Create local input and output directories. These local directories are mapped to Docker container directories, ensuring secure file access.

mkdir input output

  3. Place your files in the input/ directory:

cp /path/to/carrier.wav input/
cp /path/to/secret.pdf input/

Build & Run
# Build and start the container
docker-compose up -d ghostbit

# Encode files
docker-compose exec ghostbit ghostbit audio encode -i input/carrier.wav -s input/secret.pdf -o encoded.wav -p

# Decode files
docker-compose exec ghostbit ghostbit audio decode -i output/encoded.wav -p

# Check capacity
docker-compose exec ghostbit ghostbit audio capacity -i input/carrier.wav -q high

# Analyze file
docker-compose exec ghostbit ghostbit audio analyze -i output/encoded.wav

Cleanup
# Stop the container
docker-compose stop

# Remove the container
docker-compose down

# Remove container and images
docker-compose down --rmi all

LLM Integration

GH0STB1T includes a Skills system designed for seamless integration with LLMs and AI assistants.

Available Skills

GH0STB1T provides six specialized skill documents:

  1. Audio Steganography - Complete usage guide with examples
  2. Audio Capacity - Capacity planning and optimization strategies
  3. Audio Troubleshooting - Common issues and solutions
  4. Image Steganography - Complete usage guide with examples
  5. Image Capacity - Capacity planning and optimization strategies
  6. Image Troubleshooting - Common issues and solutions

Quick Start for LLMs

Retrieve Documentation
from ghostbit.audiostego import get_audio_llm_context

# Get complete documentation formatted for LLMs
context = get_audio_llm_context()

# Use in your LLM prompt
prompt = f"""
You are an expert in audio steganography using AudioStego.

{context}

User: How do I hide a 5MB PDF in a 10-minute WAV file with maximum security?

Please provide a complete Python example with security best practices.
"""

# Send prompt to your LLM
# response = your_llm_api(prompt)

Load Specific Skills
from ghostbit.audiostego import load_audio_skill

# Load a specific skill
stego_skill = load_audio_skill("steganography")

# Get skill content
print(stego_skill.content)

# Get examples from skill
examples = stego_skill.get_examples()
for example in examples:
    print(f"Language: {example['language']}")
    print(f"Description: {example['description']}")
    print(f"Code:\n{example['code']}\n")

# Get specific section
best_practices = stego_skill.get_section("Best Practices")
print(best_practices)

Create a Prompt Template
from ghostbit.audiostego import get_audio_llm_context

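# Prepare context
skills_context = get_audio_llm_context()

# Example question to interpolate into the prompt (placeholder text)
user_question = "How do I hide several PDF files in one WAV file?"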

# Create detailed prompt
prompt = f"""
You are an expert Python developer specializing in audio steganography.

CONTEXT:
{skills_context}

TASK:
The user wants to create a secure file hiding system for sensitive documents.

Requirements:
- Hide multiple PDF files in a single audio carrier
- Use strong encryption with user-provided passwords
- Show progress during encoding/decoding
- Handle errors gracefully
- Verify file integrity after extraction

USER QUESTION: {user_question}

Please provide:
1. Complete working code
2. Security considerations
3. Error handling strategy
4. Usage example

Format your response as:
- Code blocks with explanations
- Security notes
- Example usage
"""

# Send to LLM API
# response = llm_api.generate(prompt)

Integrate with Anthropic Claude API
import anthropic
from ghostbit.audiostego import get_audio_llm_context

client = anthropic.Anthropic(api_key="your-api-key")
context = get_audio_llm_context()

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=2048,
    system=f"You are an AudioStego expert.\n\n{context}",
    messages=[
        {"role": "user", "content": "Show me how to use AudioStego with error handling"}
    ]
)

print(message.content[0].text)

Integrate with OpenAI API
from openai import OpenAI
from ghostbit.audiostego import get_audio_llm_context

client = OpenAI(api_key="your-api-key")
context = get_audio_llm_context()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"You are an AudioStego expert.\n\n{context}"},
        {"role": "user", "content": "How do I encode files with maximum security?"}
    ]
)

print(response.choices[0].message.content)

Troubleshooting


Contributions

Contributions are welcome!

Here's how to get started: CONTRIBUTING.md


Citation

GH0STB1T is a free and open source education and research tool. If you use GH0STB1T in your research or project, please cite it as:

@software{audiostego2026,
  author = {Karie Moorman},
  title = {GH0STB1T: A Multi-format Steganography Toolkit for Python},
  year = {2026},
  url = {https://github.com/kariemoorman/ghostbit},
  version = {0.0.1}
}

APA Format:

Moorman, Karie. (2026). GH0STB1T: A Multi-format Steganography Toolkit for Python (Version 0.0.1) [Computer software]. https://github.com/kariemoorman/ghostbit

License

This project is licensed under the Apache License 2.0. See LICENSE for details.