Uncensored, abliterated & reasoning-distilled GGUFs — forged on 8×H200 SXM5 | 1.1TB VRAM
We take frontier open-weight models and make them actually useful:
- 🔓 Abliteration — Remove refusal training without destroying capability
- 🧠 Reasoning Distillation — Inject Claude Opus-level reasoning into open models via SFT
- 📦 Full GGUF Quant Ladder — Q2_K through BF16, every size for every setup
- 🔥 Real Benchmarks — Not vibes, not "it feels smarter." Numbers.
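Abliteration, as used here, follows the directional-ablation idea: estimate a "refusal direction" from the difference between mean activations on refused vs. accepted prompts, then project that direction out of the weights that write into the residual stream. A minimal NumPy sketch of just that projection step (the activation matrices, shapes, and layer choice are illustrative, not our actual pipeline config):

```python
import numpy as np

def refusal_direction(h_refused: np.ndarray, h_accepted: np.ndarray) -> np.ndarray:
    """Difference-of-means direction between two sets of hidden states.

    h_refused / h_accepted: (n_prompts, d_model) activations at one layer.
    Returns a unit vector in activation space.
    """
    d = h_refused.mean(axis=0) - h_accepted.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(W: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Project the refusal direction out of a weight matrix.

    W: (d_model, d_in) matrix writing into the residual stream.
    Returns (I - d d^T) W, so W's outputs carry no component along d.
    """
    return W - np.outer(d, d @ W)
```

After ablation, `d @ W` is (numerically) zero: the model can no longer write along the refusal direction through that matrix, while all orthogonal components are untouched.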
All models are processed on bare metal:
| Component | Spec |
|---|---|
| GPUs | 8× NVIDIA H200 SXM5 (141GB each) |
| Total VRAM | 1.1TB |
| Storage | 35TB NVMe RAID0 |
| Inference | Full BF16, no compromises |
We don't rent A100s for 4 hours and pray. We run the full pipeline on dedicated iron.
Released:

| Model | Size | Type | Status | Link |
|---|---|---|---|---|
| Mistral-Small-4-119B-Uncensored-GGUF | 119B | Abliterated, 7 quants | 🆕 New | HuggingFace |
In the pipeline:

| Model | Type | ETA |
|---|---|---|
| Qwen3.5-122B-A10B-Claude-Opus-Reasoning-Uncensored-GGUF | Reasoning distilled + abliterated | Training now (~18h) |
| Nemotron-3-Super-120B-Uncensored-GGUF | Abliterated | Next |
We have 1.1TB VRAM. We don't use optimization hacks designed for VRAM-poor setups. Raw transformers + peft + trl. Native SDPA. Direct llama.cpp. Every abstraction layer is a failure point we don't need.
Broken dependency? Diagnose once, route around it, ship. We don't spend 5 attempts patching someone else's bug when the lower-level tool works fine.
Corrupt output = fix or rebuild. Never ship broken artifacts. Every GGUF is magic-byte verified before upload.
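The magic-byte check is a cheap header read: every valid GGUF file starts with the ASCII magic `GGUF`, followed by a little-endian version and the tensor/metadata counts. A minimal sketch of that verification (field layout per the GGUF spec for v2/v3; the function name and return shape are ours, not llama.cpp API):

```python
import struct

def check_gguf_header(path: str) -> dict:
    """Validate the GGUF magic and read the header counts.

    GGUF v2/v3 header layout (little-endian):
      4-byte magic "GGUF", uint32 version,
      uint64 tensor_count, uint64 metadata_kv_count.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError(f"{path}: not a GGUF file (bad magic)")
    version, n_tensors, n_kv = struct.unpack("<IQQ", header[4:24])
    return {"version": version, "tensors": n_tensors, "kv_pairs": n_kv}
```

The tensor count read here is what gets cross-checked against the expected count for the architecture before anything is uploaded.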
Scout trending model on HF
→ Download full weights to /data (35TB NVMe)
→ Abliterate on 8×H200 (full BF16, extract+remove refusal direction)
→ Optional: SFT with reasoning datasets (Claude Opus distillation)
→ Convert to GGUF (llama.cpp convert_hf_to_gguf.py)
→ Quantize full ladder: Q2_K, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0, BF16
→ Verify all artifacts (magic bytes, tensor counts)
→ Benchmark (llama-bench, perplexity where arch supports it)
→ Upload to HuggingFace + GitHub release
→ Post to r/LocalLLaMA
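The quantize step above fans one BF16 GGUF out into the whole ladder. A sketch of a driver for it, assuming llama.cpp's `llama-quantize` binary is on PATH (file naming and quant list mirror the ladder above; this is illustrative, not our exact script):

```python
import subprocess
from pathlib import Path

QUANTS = ["Q2_K", "Q3_K_M", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]
LLAMA_QUANTIZE = "llama-quantize"  # assumed llama.cpp binary name

def quant_commands(bf16_gguf: str, quants=QUANTS) -> list[list[str]]:
    """Build one llama-quantize invocation per quant type in the ladder."""
    stem = Path(bf16_gguf).stem.removesuffix("-BF16")
    return [
        [LLAMA_QUANTIZE, bf16_gguf, f"{stem}-{q}.gguf", q]
        for q in quants
    ]

def run_ladder(bf16_gguf: str) -> None:
    for cmd in quant_commands(bf16_gguf):
        # check=True: fail fast so a corrupt quant never reaches upload
        subprocess.run(cmd, check=True)
```

`check=True` enforces the "never ship broken artifacts" rule at the process level; the header verification step then re-checks each output file independently.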
If our models are useful to you:
☕ Buy Me A Coffee — every coffee funds more uncensored models
Each model inherits its base model's license (Apache 2.0, Llama Community, etc.). The pipeline code in this repo is MIT.
⚡ Forged on 8×H200 SXM5 | 1.1TB VRAM