Awesome LLMOps
The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources
A comprehensive collection of the best tools, frameworks, models, and resources for Large Language Model Operations (LLMOps)
Recently Added (January 2026)
Infrastructure & Deployment:
- SkyPilot - Run LLMs on any cloud with one command
- Modal - Serverless platform for AI/ML workloads

Evaluation & Testing:
- Ragas - Evaluation framework for RAG pipelines
- PromptFoo - Test and evaluate LLM outputs

Agent Frameworks:
- Phidata - Build AI assistants with memory and knowledge
- Composio - Integration platform for AI agents

Monitoring & Observability:
- vLLM continues to dominate high-throughput inference
- LangGraph gaining traction for stateful agent workflows
- Ollama becoming the go-to for local LLM deployment
- DeepSeek models showing impressive cost-performance ratios
LLMOps (Large Language Model Operations) is a set of practices, tools, and workflows designed to deploy, monitor, and maintain large language models in production environments. It encompasses the entire lifecycle of LLM applications, from development and training to deployment, monitoring, and continuous improvement.
Key Components of LLMOps:
- Model Development: Training, fine-tuning, and optimizing LLMs
- Deployment: Serving models efficiently at scale
- Monitoring: Tracking performance, costs, and quality
- Prompt Management: Version control and optimization of prompts
- Security: Ensuring safe and responsible AI usage
- Evaluation: Testing and validating model outputs
- Data Management: Handling training data and embeddings

| Aspect | MLOps | LLMOps |
| --- | --- | --- |
| Model Size | Typically smaller models | Very large models (billions of parameters) |
| Training | Full model training common | Fine-tuning and prompt engineering preferred |
| Deployment | Standard serving infrastructure | Specialized inference optimization required |
| Monitoring | Metrics-focused | Quality, safety, and cost-focused |
| Versioning | Model versions | Model + prompt + configuration versions |
| Cost | Moderate compute costs | High compute and inference costs |
| Latency | Milliseconds | Seconds (streaming helps) |
| Data | Structured/tabular data | Unstructured text, multimodal data |

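
The Versioning comparison above implies that a deployed LLM behavior is identified by more than its model weights. A minimal sketch of that idea, assuming a hypothetical `LLMRelease` record (the class, its field names, and the example model name are illustrative and not taken from any tool listed here), derives one version fingerprint from the model name, the prompt text, and the configuration together:

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LLMRelease:
    """Hypothetical record of what defines one deployed LLM behavior."""

    model: str  # model name, including its own version tag
    prompt: str  # full prompt template text
    config: dict = field(default_factory=dict)  # e.g. sampling settings

    def fingerprint(self) -> str:
        # Sort keys so that logically equal configs hash identically.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        # Shorten the SHA-256 digest to 12 hex characters for readability.
        return hashlib.sha256(payload).hexdigest()[:12]


base = LLMRelease("example-model-v1", "Summarize: {text}", {"temperature": 0.2})
edited = LLMRelease("example-model-v1", "Summarize briefly: {text}", {"temperature": 0.2})

# A prompt edit alone produces a different version identifier.
print(base.fingerprint() != edited.fingerprint())
```

Hashing all three parts together means a prompt tweak or a temperature change is tracked as a new version even when the underlying model is unchanged.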
| Model | Description | Stars | License |
| --- | --- | --- | --- |
| LLaMA | Meta's foundational large language models | | Research |
| Mistral | High-performance open models from Mistral AI | | Apache 2.0 |
| Gemma | Google's lightweight open models | N/A | Gemma License |
| Qwen | Alibaba's multilingual LLM series | | Apache 2.0 |
| DeepSeek | Cost-effective open-source LLMs | | MIT |
| Phi | Microsoft's small language models | N/A | MIT |
| ChatGLM | Bilingual conversational language model | | Apache 2.0 |
| Alpaca | Stanford's instruction-following model | | Apache 2.0 |
| Vicuna | Open chatbot trained by fine-tuning LLaMA | | Apache 2.0 |
| BELLE | Chinese language model based on LLaMA | | Apache 2.0 |
| Falcon | TII's high-performance open models | N/A | Apache 2.0 |
| Bloom | Multilingual LLM from BigScience | | RAIL |

| Model | Description | Stars |
| --- | --- | --- |
| LLaVA | Large Language and Vision Assistant | |
| MiniCPM-V | Efficient multimodal model | |
| Qwen-VL | Vision-language model from Alibaba | |

| Model | Description | Stars |
| --- | --- | --- |
| Whisper | OpenAI's speech recognition model | |
| Faster Whisper | Fast inference engine for Whisper | |

| Tool | Description | Stars |
| --- | --- | --- |
| vLLM | High-throughput and memory-efficient inference engine | |
| llama.cpp | LLM inference in C/C++ | |
| TensorRT-LLM | NVIDIA's optimized inference library | |
| LMDeploy | Toolkit for compressing and deploying LLMs | |
| DeepSpeed-MII | Low-latency inference powered by DeepSpeed | |
| CTranslate2 | Fast inference engine for Transformer models | |
| Cortex.cpp | Local AI API Platform | |
| LoRAX | Multi-LoRA inference server | |
| MInference | Speed up long-context LLM inference | |
| ipex-llm | Accelerate LLM inference on Intel hardware | |

| Platform | Description | Stars |
| --- | --- | --- |
| Ollama | Run LLMs locally with ease | |
| LocalAI | OpenAI-compatible API for local models | |
| LM Studio | Desktop app for running LLMs locally | N/A |
| GPUStack | Manage GPU clusters for LLM inference | |
| OpenLLM | Operating LLMs in production | |
| Ray Serve | Scalable model serving with Ray | |

| Framework | Description | Stars |
| --- | --- | --- |
| LangChain | Framework for developing LLM applications | |
| LlamaIndex | Data framework for LLM applications | |
| Haystack | End-to-end NLP framework | |
| Semantic Kernel | Microsoft's SDK for AI orchestration | |
| Langfuse | Open-source LLM engineering platform | |
| Neurolink | Universal AI development platform | |

| Framework | Description | Stars |
| --- | --- | --- |
| AutoGPT | Autonomous AI agent framework | |
| CrewAI | Framework for orchestrating AI agents | |
| AutoGen | Multi-agent conversation framework | |
| LangGraph | Build stateful multi-actor applications | |
| AgentMark | Type-safe Markdown-based agents | |

| Tool | Description | Stars |
| --- | --- | --- |
| Prefect | Modern workflow orchestration | |
| Airflow | Platform to programmatically author workflows | |
| Flyte | Kubernetes-native workflow automation | |
| Flowise | Drag & drop UI for LLM flows | |

| Tool | Description | Stars |
| --- | --- | --- |
| Axolotl | Streamlined LLM fine-tuning | |
| LLaMA-Factory | Unified fine-tuning framework | |
| PEFT | Parameter-Efficient Fine-Tuning | |
| Unsloth | 2x faster LLM fine-tuning | |
| TRL | Transformer Reinforcement Learning | |
| LitGPT | Pretrain, fine-tune, deploy LLMs | |

| Tool | Description | Stars |
| --- | --- | --- |
| Weights & Biases | ML experiment tracking | |
| MLflow | Open-source ML lifecycle platform | |
| TensorBoard | TensorFlow's visualization toolkit | |
| Aim | Easy-to-use experiment tracker | |

| Tool | Description | Stars |
| --- | --- | --- |
| Chroma | AI-native embedding database | |
| Weaviate | Vector search engine | |
| Qdrant | Vector similarity search engine | |
| Milvus | Cloud-native vector database | |
| Pinecone | Managed vector database | N/A |
| FAISS | Efficient similarity search library | |
| pgvector | Vector similarity search for Postgres | |
| LanceDB | Developer-friendly vector database | |

Observability & Monitoring

| Tool | Description | Stars |
| --- | --- | --- |
| Langfuse | Open-source LLM observability | |
| Phoenix | AI observability & evaluation | |
| Helicone | Open-source LLM observability | |
| Lunary | Production toolkit for LLMs | N/A |
| OpenLIT | OpenTelemetry-native LLM observability | |
| Evidently | ML and LLM observability framework | |
| DeepEval | LLM evaluation framework | |
| PostHog | Product analytics and feature flags | |

| Tool | Description | Stars |
| --- | --- | --- |
| DVC | Data version control | |
| LakeFS | Git for data lakes | |
| Pachyderm | Data versioning and pipelines | |
| Delta Lake | Storage framework for data lakes | |

Optimization & Performance

| Tool | Description | Stars |
| --- | --- | --- |
| GitHub Copilot | AI pair programmer | N/A |
| Cursor | AI-first code editor | N/A |
| Continue | Open-source AI code assistant | |
| Cody | AI coding assistant | N/A |
| Tabby | Self-hosted AI coding assistant | |

| Tool | Description | Stars |
| --- | --- | --- |
| Jupyter | Interactive computing environment | |
| Google Colab | Free cloud notebooks | N/A |
| Gradient | Managed notebooks and workflows | N/A |

| Platform | Description | Stars |
| --- | --- | --- |
| Agenta | LLMOps platform for building robust apps | |
| Dify | LLM app development platform | |
| Pezzo | Open-source LLMOps platform | |
| Humanloop | Prompt management and evaluation | N/A |
| PromptLayer | Prompt engineering platform | N/A |
| Weights & Biases | ML platform with LLM support | N/A |

We welcome contributions from the community! Here's how you can help:
1. Fork the repository
2. Create a new branch (`git checkout -b feature/amazing-tool`)
3. Add your contribution following our guidelines
4. Commit your changes (`git commit -m 'Add amazing tool'`)
5. Push to the branch (`git push origin feature/amazing-tool`)
6. Open a Pull Request

- Quality over quantity: Only add tools/resources you've personally used or thoroughly researched
- Keep descriptions concise: 1-2 sentences maximum
- Include GitHub stars badge: Use the format shown in existing entries
- Maintain alphabetical order: Within each category
- Check for duplicates: Search before adding
- Update the Table of Contents: If adding new sections
- Follow the existing format: Match the style of current entries
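
Two of the guidelines above, alphabetical order and duplicate checks, can be verified mechanically before opening a pull request. A minimal sketch, assuming a hypothetical helper `check_entries` that receives the tool names of one category as plain strings (the helper name and its input format are assumptions for illustration, not an existing script):

```python
def check_entries(names):
    """Return a list of problems found in one category's tool names."""
    problems = []
    seen = set()
    for name in names:
        key = name.casefold()  # compare case-insensitively
        if key in seen:
            problems.append(f"duplicate: {name}")
        seen.add(key)
    folded = [name.casefold() for name in names]
    if folded != sorted(folded):
        problems.append("not in alphabetical order")
    return problems


print(check_entries(["Axolotl", "LitGPT", "PEFT"]))  # prints []
print(check_entries(["Axolotl", "PEFT", "LitGPT"]))  # prints ['not in alphabetical order']
```

Comparison is case-insensitive so that entries such as `pgvector` and `Pinecone` sort the way a reader scanning the list would expect.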

✅ New tools, frameworks, or platforms
✅ Useful resources, tutorials, or guides
✅ Bug fixes or improvements to existing entries
✅ Better descriptions or categorizations
❌ Promotional content or spam
❌ Outdated or unmaintained projects (unless historically significant)

See CONTRIBUTING.md for detailed guidelines.
This project is licensed under CC0 1.0 Universal. See LICENSE for details.
This repository is inspired by and builds upon several excellent awesome lists.
Special thanks to all contributors who help maintain and improve this resource!
If you find this repository helpful, please consider giving it a ⭐️
Made with ❤️ by the community