Nostr npub vanity address miner with GPU acceleration (CUDA).
5.8 billion keys/sec on RTX 5070 Ti - find an 8-character prefix in ~3.5 minutes!
Features:
- GPU-accelerated mining using CUDA
- Multiple prefix search (OR matching)
- Optimized secp256k1 implementation with endomorphism
- PTX-level optimizations for maximum performance
- Triple buffering for 100% GPU utilization
Requirements:
- NVIDIA GPU (Compute Capability 7.5+)
- CUDA Toolkit 12.x or 13.x
- Windows or Linux (WSL supported for building)
# Clone the repository
git clone https://github.com/high-moctane/mocnpub.git
cd mocnpub
# Build (requires CUDA Toolkit for GPU support)
cargo build --release

You can customize the build with environment variables:
# Custom keys per thread (default: 1600)
MAX_KEYS_PER_THREAD=2048 cargo build --release

Basic usage:
./target/release/mocnpub-main mine --prefix m0ctane

| Option | Description | Default |
|---|---|---|
| `--prefix <PREFIX>` | Prefix to search for (required) | - |
| `--limit <N>` | Number of keys to find (0 = unlimited) | 1 |
| `--output <FILE>` | Output file (optional) | stdout |
| `--batch-size <N>` | GPU batch size | 4000000 |
| `--threads-per-block <N>` | GPU threads per block | 128 |
| `--miners <N>` | Number of parallel miners | 2 |

Search for multiple prefixes at once (OR matching):
./target/release/mocnpub-main mine --prefix m0ctane,sakura,n0str
# Find a 4-character prefix (fast, < 1 second)
./target/release/mocnpub-main mine --prefix 0000
# Find an 8-character prefix (~3.5 minutes)
./target/release/mocnpub-main mine --prefix m0ctane0
# Find multiple keys
./target/release/mocnpub-main mine --prefix m0c --limit 5
# Save to file
./target/release/mocnpub-main mine --prefix test --output keys.txt
Benchmarked on RTX 5070 Ti (16GB VRAM):
5.9 billion keys/sec (84,935x faster than CPU baseline)
| Prefix Length | Combinations | Expected Time |
|---|---|---|
| 4 chars | ~1M | < 1 sec |
| 6 chars | ~1B | < 1 sec |
| 8 chars | ~1T | ~3.5 min |
| 10 chars | ~1P | ~2.5 days |
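
The expected times in the table above follow from treating every candidate key as an independent trial that matches an n-character prefix with probability 32^-n, so the mean number of attempts is 32^n. The sketch below reproduces that arithmetic. The function names are illustrative and not part of mocnpub, and the throughput passed in is whatever rate you measure on your own GPU.

```rust
/// Mean number of keys to try before an `n`-character bech32 prefix matches.
/// Each character has 32 possible values, so the mean is 32^n
/// (the expectation of a geometric distribution with p = 32^-n).
fn expected_attempts(prefix_len: u32) -> f64 {
    32f64.powi(prefix_len as i32)
}

/// Mean search time in seconds at a given throughput (keys per second).
fn expected_seconds(prefix_len: u32, keys_per_sec: f64) -> f64 {
    expected_attempts(prefix_len) / keys_per_sec
}

/// Probability of at least one match after `attempts` tries:
/// 1 - (1 - p)^attempts, computed with ln_1p/exp_m1 to stay accurate for tiny p.
fn hit_probability(prefix_len: u32, attempts: f64) -> f64 {
    let p = 1.0 / expected_attempts(prefix_len);
    -(attempts * (-p).ln_1p()).exp_m1()
}
```

With the 5.8 billion keys/sec figure quoted at the top, `expected_seconds(8, 5.8e9)` comes out near 190 seconds, the same order as the table's entry. These are averages: only about 63% of searches finish within the mean number of attempts, so an individual run can take noticeably longer.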
Note: bech32 uses a 32-character alphabet (digits and lowercase letters, excluding 1, b, i, o), so each additional prefix character adds 5 bits (log2 32) and makes the search 32 times longer.
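
Because of the restricted alphabet in the note above, a prefix containing 1, b, i, o, or an uppercase letter can never match. The sketch below shows one way to check a comma-separated prefix list up front and to apply OR matching. It is a hypothetical illustration, not mocnpub's own code: the function names are invented here, and it assumes the prefix is compared against the characters that follow `npub1`.

```rust
/// The 32-character bech32 alphabet (BIP-173); it omits '1', 'b', 'i', and 'o'.
const BECH32_CHARSET: &str = "qpzry9x8gf2tvdw0s3jn54khce6mua7l";

/// Returns the first character of `prefix` that cannot appear in bech32 data.
fn first_invalid_char(prefix: &str) -> Option<char> {
    prefix.chars().find(|c| !BECH32_CHARSET.contains(*c))
}

/// Splits a comma-separated prefix list and rejects unusable entries.
fn parse_prefixes(arg: &str) -> Result<Vec<String>, String> {
    let mut prefixes = Vec::new();
    for p in arg.split(',') {
        if p.is_empty() {
            return Err("empty prefix".to_string());
        }
        if let Some(c) = first_invalid_char(p) {
            return Err(format!("'{c}' in \"{p}\" is not a bech32 character"));
        }
        prefixes.push(p.to_string());
    }
    Ok(prefixes)
}

/// OR matching: true if the data part after "npub1" starts with any prefix.
/// (Assumption: prefixes are matched directly after the "npub1" header.)
fn matches_any(npub: &str, prefixes: &[String]) -> bool {
    match npub.strip_prefix("npub1") {
        Some(data) => prefixes.iter().any(|p| data.starts_with(p.as_str())),
        None => false,
    }
}
```

For example, `parse_prefixes("m0ctane,sakura,n0str")` accepts all three entries, while `parse_prefixes("hello")` is rejected because of the `o`.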
- Endomorphism: 2.9x coverage using secp256k1's special properties
- Montgomery's Trick: ~85x reduction in modular inversions
- Sequential key strategy: PointAdd instead of full scalar multiplication
- Addition Chain: 114 multiplications eliminated in ModInv
- PTX inline assembly: Hand-tuned carry/borrow chains
- Triple buffering: 100% GPU utilization
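
As an illustration of Montgomery's trick from the list above, the sketch below inverts a whole batch of field elements with a single modular inversion plus roughly three multiplications per element. It is a toy version under stated assumptions: it works modulo the small prime 1,000,000,007 rather than the 256-bit secp256k1 field prime, runs on the CPU, and uses names invented for this example. It is not the project's GPU kernel.

```rust
/// Small prime used as a stand-in for the 256-bit secp256k1 field prime.
const P: u64 = 1_000_000_007;

/// Modular multiplication; the u128 intermediate cannot overflow.
fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % (P as u128)) as u64
}

/// Modular inverse via Fermat's little theorem: a^(P-2) mod P.
/// This is the expensive operation the batch version tries to avoid.
fn inv_mod(a: u64) -> u64 {
    let (mut base, mut exp, mut acc) = (a % P, P - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base);
        }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    acc
}

/// Montgomery's trick: inverts every element using one `inv_mod` call.
/// All inputs must be non-zero modulo P.
fn batch_inverse(values: &[u64]) -> Vec<u64> {
    let n = values.len();
    if n == 0 {
        return Vec::new();
    }
    // prefix[i] = values[0] * values[1] * ... * values[i]
    let mut prefix = Vec::with_capacity(n);
    let mut acc = 1u64;
    for &v in values {
        acc = mul_mod(acc, v);
        prefix.push(acc);
    }
    // The only inversion: the inverse of the product of all inputs.
    let mut inv_acc = inv_mod(acc);
    let mut out = vec![0u64; n];
    for i in (1..n).rev() {
        // (v0..vi)^-1 * (v0..v(i-1)) = vi^-1
        out[i] = mul_mod(inv_acc, prefix[i - 1]);
        // Peel off vi so inv_acc becomes (v0..v(i-1))^-1.
        inv_acc = mul_mod(inv_acc, values[i]);
    }
    out[0] = inv_acc;
    out
}
```

The trade-off is that a batch shares one inversion, so larger batches amortize it further at the cost of storing the running products.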
- Pure Rust with inline CUDA (PTX)
- No external secp256k1 library dependency on GPU
- Custom 256-bit modular arithmetic with PTX optimizations
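
To make "256-bit modular arithmetic with carry/borrow chains" concrete, here is a CPU-side sketch in plain Rust. It is an assumption-laden illustration rather than the project's code: it uses four 64-bit limbs and `overflowing_add`/`overflowing_sub`, whereas a GPU kernel would typically express the same chain with PTX extended-precision instructions such as `add.cc`/`addc` and `sub.cc`/`subc`. The limb layout and function names are chosen for this example only.

```rust
/// A 256-bit unsigned integer as four 64-bit limbs, least significant first.
type U256 = [u64; 4];

/// The secp256k1 field prime p = 2^256 - 2^32 - 977.
const SECP256K1_P: U256 = [
    0xFFFF_FFFE_FFFF_FC2F,
    0xFFFF_FFFF_FFFF_FFFF,
    0xFFFF_FFFF_FFFF_FFFF,
    0xFFFF_FFFF_FFFF_FFFF,
];

/// a + b with the carry propagated limb by limb; returns (sum mod 2^256, carry out).
fn add_with_carry(a: &U256, b: &U256) -> (U256, bool) {
    let mut out = [0u64; 4];
    let mut carry = false;
    for i in 0..4 {
        let (s1, c1) = a[i].overflowing_add(b[i]);
        let (s2, c2) = s1.overflowing_add(carry as u64);
        out[i] = s2;
        carry = c1 || c2;
    }
    (out, carry)
}

/// a - b with the borrow propagated limb by limb; returns (difference mod 2^256, borrow out).
fn sub_with_borrow(a: &U256, b: &U256) -> (U256, bool) {
    let mut out = [0u64; 4];
    let mut borrow = false;
    for i in 0..4 {
        let (d1, b1) = a[i].overflowing_sub(b[i]);
        let (d2, b2) = d1.overflowing_sub(borrow as u64);
        out[i] = d2;
        borrow = b1 || b2;
    }
    (out, borrow)
}

/// (a + b) mod p, assuming both inputs are already reduced (less than p).
fn add_mod_p(a: &U256, b: &U256) -> U256 {
    let (sum, carry) = add_with_carry(a, b);
    let (reduced, borrow) = sub_with_borrow(&sum, &SECP256K1_P);
    // Subtract p when the sum overflowed 2^256 or is at least p.
    if carry || !borrow { reduced } else { sum }
}
```

This version branches on the final comparison for readability; performance-critical or constant-time code would usually select the result without a branch.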
Ensure CUDA Toolkit is installed and nvcc is in your PATH:
# Check CUDA installation
nvcc --version
# Set CUDA_PATH if needed
export CUDA_PATH=/usr/local/cuda

On NixOS, the CUDA Toolkit is split across multiple store paths. nvcc can't find cuda_runtime.h because it resolves symlinks and looks relative to its real binary path (cuda_nvcc), not the merged package (cuda-merged).
Fix: Set CUDA_PATH to the cudatoolkit store path. build.rs will pass -I$CUDA_PATH/include to nvcc.
# configuration.nix
environment.systemPackages = with pkgs; [
cudaPackages_13_1.cudatoolkit
];
# nvcc needs this to find cuda_runtime.h (NixOS symlink issue)
environment.variables.CUDA_PATH = "${pkgs.cudaPackages_13_1.cudatoolkit}";

For NixOS WSL, also add the Windows driver library path:
# fish shellInit
set -gx LD_LIBRARY_PATH /usr/lib/wsl/lib $LD_LIBRARY_PATH

Rebuild with a smaller MAX_KEYS_PER_THREAD:
MAX_KEYS_PER_THREAD=800 cargo build --release

For best performance, run the compiled binary natively on Windows rather than inside WSL.
Build in WSL, then copy to Windows, or use git pull on Windows.
For detailed information about the development journey:
- "Six Weeks Starting from Knowing Nothing: mocnpub Development Log" (original title: 何も知らないところから始めた6週間: mocnpub 開発記) - Blog post by Sakura 🌸
- JOURNEY.md - The complete story of building mocnpub
- OPTIMIZATION.md - Technical deep-dive into all 35 optimization steps
- LEARNING.md - Learning path from beginner to PTX mastery
- CODE_REVIEW.md - Code review by Claude (Web) 🌸
This project was developed through pair programming with Claude Code 🌸
MIT