# ⚡ TIMTEH Model Forge

Uncensored, abliterated & reasoning-distilled GGUFs — forged on 8×H200 SXM5 | 1.1TB VRAM

Buy Me A Coffee HuggingFace


## What We Do

We take frontier open-weight models and make them actually useful:

- 🔓 Abliteration — Remove refusal training without destroying capability
- 🧠 Reasoning Distillation — Inject Claude Opus-level reasoning into open models via SFT
- 📦 Full GGUF Quant Ladder — Q2_K through BF16, every size for every setup
- 🔥 Real Benchmarks — Not vibes, not "it feels smarter." Numbers.
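
The abliteration idea above can be sketched in a few lines: extract a "refusal direction" from model activations, then project it out of the weights so the model can no longer write along it. This is a hedged illustration of the math only; `project_out` and the toy vectors are ours, not the forge's actual code.

```python
# Illustrative sketch of abliteration's core operation: removing the
# component of a weight row that lies along a refusal direction.

def project_out(weight_row, direction):
    """Return weight_row with its component along `direction` removed."""
    norm_sq = sum(d * d for d in direction)
    # coefficient of weight_row along the (unnormalized) direction
    coeff = sum(w * d for w, d in zip(weight_row, direction)) / norm_sq
    return [w - coeff * d for w, d in zip(weight_row, direction)]

row = [1.0, 2.0, 3.0]
refusal_dir = [0.0, 1.0, 0.0]
cleaned = project_out(row, refusal_dir)

# the cleaned row is orthogonal to the refusal direction
assert abs(sum(c * d for c, d in zip(cleaned, refusal_dir))) < 1e-9
```

In the real pipeline this projection runs over full BF16 weight matrices on the H200s; the principle is the same orthogonal projection shown here.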

## Hardware

All models are processed on bare metal:

| Component | Spec |
|---|---|
| GPUs | 8× NVIDIA H200 SXM5 (141GB each) |
| Total VRAM | 1.1TB |
| Storage | 35TB NVMe RAID0 |
| Inference | Full BF16, no compromises |

We don't rent A100s for 4 hours and pray. We run the full pipeline on dedicated iron.

## Released Models

| Model | Size | Type | Downloads | Link |
|---|---|---|---|---|
| Mistral-Small-4-119B-Uncensored-GGUF | 119B | Abliterated, 7 quants | 🆕 NEW | HuggingFace |

## Coming Soon

| Model | Type | ETA |
|---|---|---|
| Qwen3.5-122B-A10B-Claude-Opus-Reasoning-Uncensored-GGUF | Reasoning distilled + abliterated | Training now (~18h) |
| Nemotron-3-Super-120B-Uncensored-GGUF | Abliterated | Next |

## Philosophy

### Raw Power Doctrine

We have 1.1TB VRAM. We don't use optimization hacks designed for VRAM-poor setups. Raw transformers + peft + trl. Native SDPA. Direct llama.cpp. Every abstraction layer is a failure point we don't need.

### No-Bullshit Engineering

Broken dependency? Diagnose once, route around it, ship. We don't spend five attempts patching someone else's bug when the lower-level tool works fine.

### Ship Fast, Ship Clean

Corrupt output = fix or rebuild. Never ship broken artifacts. Every GGUF is magic-byte verified before upload.
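
The magic-byte check is simple: every valid GGUF file begins with the 4-byte magic `GGUF`. A minimal sketch (the function name and demo file are ours, not the forge's actual verifier):

```python
import tempfile

# Every valid GGUF file starts with these four bytes.
GGUF_MAGIC = b"GGUF"

def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC

# Demo on a throwaway file that carries the right magic.
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as f:
    f.write(GGUF_MAGIC + b"\x00" * 8)  # magic followed by dummy header bytes
    demo_path = f.name

assert looks_like_gguf(demo_path)
```

A real verifier would also parse the header version and tensor count, but the magic check alone catches truncated or corrupted uploads cheaply.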

## Pipeline

```
Scout trending model on HF
  → Download full weights to /data (35TB NVMe)
  → Abliterate on 8×H200 (full BF16, extract + remove refusal direction)
  → Optional: SFT with reasoning datasets (Claude Opus distillation)
  → Convert to GGUF (llama.cpp convert_hf_to_gguf.py)
  → Quantize full ladder: Q2_K, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0, BF16
  → Verify all artifacts (magic bytes, tensor counts)
  → Benchmark (llama-bench, perplexity where arch supports it)
  → Upload to HuggingFace + GitHub release
  → Post to r/LocalLLaMA
```
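
The quantize step fans one BF16 GGUF out across the whole ladder with llama.cpp's `llama-quantize` tool. A hedged sketch of how those invocations could be built (paths, output naming, and the choice to build rather than run the commands are illustrative, not the forge's actual script):

```python
# Build one llama-quantize command per rung of the quant ladder.
# The BF16 source file ships as-is, so it is not in this list.
QUANT_LADDER = ["Q2_K", "Q3_K_M", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]

def ladder_commands(bf16_path, out_dir):
    """Return argv lists, one llama-quantize call per quant type."""
    cmds = []
    for quant in QUANT_LADDER:
        out_path = f"{out_dir}/model-{quant}.gguf"
        cmds.append(["llama-quantize", bf16_path, out_path, quant])
    return cmds

cmds = ladder_commands("model-BF16.gguf", "out")
```

Each argv list could then be handed to `subprocess.run`, with the magic-byte verification pass running over every output before upload.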

## Support

If our models are useful to you:

Buy Me A Coffee — every coffee funds more uncensored models

## License

Individual model licenses follow their base model's license (Apache 2.0, Llama Community, etc.). Pipeline code in this repo is MIT.


⚡ Forged on 8×H200 SXM5 | 1.1TB VRAM
