Stop reading about agents. Start building them.
This is the repo for engineers who want to understand what's behind popular agents like Claude Code, Codex, and GitHub Copilot, and then build one themselves. From your first LLM call to a production eval harness.
Building AI agents is engineering, not magic. Master the constraints, not the hype.
How many of you actually can pull out a whiteboard and build me an agent? Can you show me the inferencing loop?
If you don't know this, your career is in jeopardy.
What is a tool call? If you don't know what that is, you need to learn what it is and all these basic fundamentals. I preference candidates if they know what a tool call is, how the inferencing loop works, pull out a whiteboard — the same way we used to say, show me a linked list, reverse me this data structure.
This is now baseline knowledge because we're getting candidates in that can answer this stuff.
— Geoffrey Huntley, creator of Ralph Wiggum
Agent fluency is the new data-structures interview. We teach it from first principles - you build the loop, the tool calls, the memory, and the evals yourself before we ever introduce a framework. No magic. No black boxes. Just the primitives, in the order they were invented.
💡 None of this requires fancy frameworks. Just an LLM API, some tools, and a loop. Build one this weekend. You'll understand agents better than reading 100 blog posts.
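The claim above is literal: an agent is a loop around an LLM call that executes tools until the model stops asking for them. Here is a minimal sketch of that loop with the model stubbed out as a plain function so the control flow is visible; the stub, tool, and message shapes are illustrative, and the tutorials replace the stub with a real Anthropic/OpenAI API call.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool the agent can call."""
    return json.dumps({"city": city, "temp_c": 21})

TOOLS = {"get_weather": get_weather}

def stub_model(messages):
    """Stand-in for the LLM: requests a tool once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_weather", "args": {"city": "Berlin"}}}
    return {"final": "It's 21°C in Berlin."}

def agent_loop(user_prompt: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = stub_model(messages)
        if "final" in reply:              # model is done: return its answer
            return reply["final"]
        call = reply["tool_call"]         # otherwise: run the requested tool
        result = TOOLS[call["name"]](**call["args"])
        messages.append({"role": "tool", "content": result})  # feed result back
    raise RuntimeError("agent did not finish within max_steps")

print(agent_loop("What's the weather in Berlin?"))
```

Everything else in this repo — memory, routing, evals — is layered on top of this one loop.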
No prior AI/ML experience required - just Python basics and curiosity about building LLM-powered agents.
- We take production agents apart. The Disassembling AI Agents Substack series reverse-engineers Claude Code, GitHub Copilot, and OpenCode. You read how real agents work, then rebuild the pieces here.
- First principles, no black boxes. You build the agent loop, the tool executor, the memory layer, the eval harness from scratch — before we introduce a single framework. Learn what each abstraction is hiding before you let one hide it for you.
- Runnable in one command. `uv run --directory <tutorial> python <script>.py`. No conda dance. No Jupyter kernel hunt.
```shell
brew install uv  # or: pipx install uv
git clone https://github.com/agenticloops-ai/agentic-ai-engineering.git
cd agentic-ai-engineering
cp .env.example .env  # add your Anthropic and/or OpenAI keys
uv run --directory 01-foundations/01-simple-llm-call python 01_llm_call_anthropic.py
```

That's it. Every tutorial is self-contained and idempotent — you can jump in anywhere. Full setup details in SETUP.md. Or open in Codespaces and skip local setup entirely.
If you find this useful, a ⭐️ star helps us know we're on the right track. Join the 💬 discussion or report an 🐛 issue — your input directly shapes what we build next.
The tutorials teach you to build. Our Substack gives you the mental model first - a foundational primer on how agents actually work, followed by teardowns of real production agents you use every day. Read the post. Open the tutorial. Rebuild the pattern.
How Agents Work: The Patterns Behind the Magic - the core agentic loop from first principles. The four pattern levels (one-shot → single-tool → ReAct → planning), the role of the system prompt as behavioral design, and Ralph Mode as the outer loop. If you read one thing before opening the repo, read this. Pairs with → 01-foundations.
The tutorials are organized into modules (01-foundations, 02-effective-agents) that progress from basics to advanced concepts. Each module contains numbered tutorials that build on previous lessons. Inside each tutorial folder, you'll find:
- Python scripts — Self-contained, runnable examples demonstrating key concepts
- README.md — Detailed explanations, code walkthroughs, and learning objectives
You can explore individual scripts independently or follow the complete learning path from start to finish. Each module ends with a project that combines all concepts from the module into a single, production-style agent.
Your first steps — from a single API call to a fully autonomous agent loop. Build everything from scratch to understand what's really happening under the hood.
- Simple LLM Call — First API call with token tracking
- Prompt Engineering — Guide model behavior
- Chat — Interactive chat with message history
- Tool Use — Enable function calling
- Agent Loop — Autonomous tool-using agents
- Codebase Navigator — The Augmented LLM with RAG, tools, and memory
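Mechanically, the "Tool Use" step above means two things: a JSON Schema the model sees, and a Python function you run when the model emits a matching tool-use block. A sketch, using the Anthropic Messages API tool format (`name`, `description`, `input_schema`, and a `tool_result` echoed back under the same `tool_use` id); the `read_file` tool itself is illustrative:

```python
def read_file(path: str) -> str:
    """Hypothetical tool body. The model never runs this — your code does."""
    with open(path) as f:
        return f.read()

# What the model sees: a name, a description, and a JSON Schema for arguments.
READ_FILE_TOOL = {
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "File path"}},
        "required": ["path"],
    },
}

def execute_tool_call(block: dict) -> dict:
    """Dispatch a parsed tool_use block to the matching Python function."""
    registry = {"read_file": read_file}
    output = registry[block["name"]](**block["input"])
    # The result goes back to the model, tagged with the originating call's id.
    return {"type": "tool_result", "tool_use_id": block["id"], "content": output}
```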
Architectural patterns that separate toy demos from real agents. Based on Anthropic's "Building Effective Agents" — learn when to chain, route, parallelize, or delegate.
- Prompt Chaining — Sequential multi-step pipelines
- Routing — Classify input, dispatch to specialized handlers
- Parallelization — Fan-out/fan-in, parallel tool calls
- Orchestrator-Workers — Dynamic task decomposition
- Evaluator-Optimizer — Self-critique, iterative refinement
- Human in the Loop — Approval gates, escalation, feedback
- Content Writer — Full agent composing all agentic workflow patterns
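To make the Routing pattern concrete: classify the input, then dispatch to a specialized handler. In this sketch the classifier is a keyword stub and the handlers are trivial lambdas, both illustrative; in the tutorial the classifier is a cheap LLM call that returns one label, and each handler is a specialized prompt or sub-agent.

```python
def classify(text: str) -> str:
    """Stub classifier — the tutorial replaces this with an LLM call."""
    text = text.lower()
    if "refund" in text or "charge" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "general"

# One specialized handler per label. In a real agent these are distinct
# prompts, tools, or sub-agents tuned for their category.
HANDLERS = {
    "billing": lambda q: f"[billing agent] {q}",
    "technical": lambda q: f"[tech agent] {q}",
    "general": lambda q: f"[generalist] {q}",
}

def route(query: str) -> str:
    return HANDLERS[classify(query)](query)
```

The payoff is separation of concerns: each handler's prompt stays small and focused instead of one mega-prompt trying to cover every case.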
Practical engineering problems you'll hit the moment agents leave the prototype stage. Context, cost, memory, multimodality, safety — solved one tutorial at a time.
- Structured Output — JSON mode, schemas, constrained generation
- Streaming — SSE, token-by-token output, streaming tool calls
- Context Engineering — Window strategies, summarization, tool context
- Cost Optimization — Prompt caching, model routing
- Memory — Short-term, long-term, memory inspection
- RAG Techniques — Hybrid search, agentic retrieval
- Multimodal — Vision, image generation, audio
- Guardrails — Input/output filtering, safety patterns
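A taste of the first problem on that list: even when asked for JSON, models often wrap it in a markdown fence or drop a field, so production code parses defensively and validates before trusting the result. A minimal sketch — the field names and schema are illustrative:

```python
import json
import re

REQUIRED = {"title", "sentiment", "tags"}  # hypothetical schema

def parse_structured(raw: str) -> dict:
    """Extract and validate a JSON object from raw model output."""
    # Strip an optional ```json ... ``` fence the model may emit.
    m = re.search(r"```(?:json)?\s*(.*?)\s*```", raw, re.DOTALL)
    payload = m.group(1) if m else raw
    data = json.loads(payload)           # raises on malformed JSON
    missing = REQUIRED - data.keys()
    if missing:                          # fail loudly, don't propagate bad data
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

The tutorial covers the stronger options — JSON mode, schema-constrained generation — but a validation layer like this remains the last line of defense.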
Agents are non-deterministic — testing them requires different thinking. Measure quality, catch regressions, and build confidence before shipping.
- Unit Testing Agents — Mocking LLMs, deterministic tests
- Evals — Accuracy, quality, regression benchmarks
- Tracing & Debugging — Observability during development
- Red Teaming & Safety — Adversarial testing, guardrails
- Benchmarking — Comparing models, prompts, architectures head-to-head
- Eval Frameworks — Promptfoo, Braintrust, Langfuse integration
- Eval Harness — Complete eval pipeline combining all techniques
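The core of every eval harness above is the same shape: run a fixed dataset through the agent, score each answer, report aggregate metrics. A minimal sketch with a deterministic stub in place of the agent (the stub and exact-match scoring are simplifications; real evals add LLM-as-judge scoring and regression tracking):

```python
from dataclasses import dataclass

@dataclass
class Case:
    prompt: str
    expected: str

def stub_agent(prompt: str) -> str:
    """Deterministic stand-in for the agent under test."""
    return {"2+2?": "4", "Capital of France?": "Paris"}.get(prompt, "")

def run_evals(agent, cases):
    """Score each case with exact match; return accuracy and failures."""
    results = [(c, agent(c.prompt)) for c in cases]
    passed = sum(1 for c, got in results if got == c.expected)
    return {
        "accuracy": passed / len(cases),
        "failures": [c.prompt for c, got in results if got != c.expected],
    }
```

Because the score is a number, you can gate CI on it — exactly how you catch regressions in a non-deterministic system.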
One agent, nine implementations. Build the same system with each framework and compare the trade-offs firsthand.
- No Framework — Raw SDK baseline
- LangGraph — Graph-based orchestration
- Pydantic AI — Type-safe agents
- Google ADK — Google's Agent Development Kit
- AWS Strands — AWS agent SDK
- CrewAI — Role-based multi-agent collaboration
- AutoGen — Multi-agent conversations
- LlamaIndex — Data-centric agents
- Semantic Kernel — Microsoft AI orchestration
The gap between "works on my laptop" and "runs reliably at scale." Principles, deployment, monitoring, cost control, and security.
- 12-Factor Agents — Principles for production-grade agents
- Deployment Strategies — Containers, serverless, scaling
- Monitoring & Observability — Metrics, logging, tracing in prod
- Cost Optimization — Token budgets, caching, model routing
- Security & Guardrails — Auth, sandboxing, injection defense
- Error Handling & Resilience — Retries, fallbacks, graceful degradation
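The last bullet in one sketch: retry transient failures with exponential backoff, then degrade gracefully to a fallback model when the primary keeps failing. The error type, delays, and model labels here are illustrative:

```python
import time

class TransientError(Exception):
    """Stand-in for a rate limit or timeout from the provider."""

def call_with_resilience(call, primary, fallback, retries=3, base_delay=1.0):
    """Retry `call(primary)` with exponential backoff, then try `fallback` once."""
    for attempt in range(retries):
        try:
            return call(primary)
        except TransientError:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    # Graceful degradation: a cheaper/slower model beats a hard failure.
    return call(fallback)
```

In production you'd add jitter to the delay and only catch error types you know are transient — retrying an auth failure just burns time.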
If you find this project useful, consider supporting us:
Module not found? Run `uv sync` in the lesson directory.
API errors or authentication failures? You need API keys from Anthropic, OpenAI, or both, depending on which examples you run. See SETUP.md for details.
This project is licensed under the MIT License - see the LICENSE file for details.