This repository contains a series of notebooks that progressively build and evaluate LLM-powered chatbot systems and agents. The focus is on experimentation with safeguards, observability, and evaluation frameworks.
Stack: ollama, websearch-tool, gpt-oss, gpt-oss-safeguard.
- 5-basic-llm-tracing
- 6-chatbot-tracing
- 7-chatbot-evaluate
- 8-llm-compare-experiments
- 9-streamlit-mlflow-fastapi-ollama
Stack: streamlit, fastapi, mlflow (tracing and evaluate components), ollama, gpt-oss.
This project is licensed under the MIT License.