This document allows another LLM session to continue development without breaking the system. Read this before making any changes.
- Name: Clawrity
- Type: AI Business Intelligence Platform
- Language: Python 3.11+
- Framework: FastAPI + LangChain + Groq/DeepSeek
- LLM Strategy (IMPORTANT, cost optimization):
  - Groq (FREE): NL-to-SQL, QA scoring, draft generation, all non-critical tasks
  - DeepSeek (PAID): final polished chat responses ONLY (use sparingly)
  - Rationale: students with a limited budget, so maximize free API usage
- Channel Priority: Slack first, then Teams, then WhatsApp
- Data Source: BigQuery (real) with Mock Data Service fallback for demo
- Deployment: Azure VM with Docker containers (Terraform in infra/)
- Client Config: YAML files in the clients/ directory, one file per client
- No Hardcoded Secrets: all credentials via environment variables (.env file)
LLM Usage Strategy

| | GROQ (FREE) | DeepSeek (PAID) |
|---|---|---|
| Tasks | NL-to-SQL generation, QA scoring/hallucination, draft summaries, headline extraction, all retries | Final chat response polish (only when explicitly needed) |
| Model | llama-3.3-70b-versatile | deepseek-chat |
| Cost | $0 | Pay per token |
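The routing rule above boils down to a one-line decision: everything runs on Groq's free tier except the optional final polish. A minimal sketch (the task labels are illustrative assumptions, not identifiers from the real codebase):

```python
# Route each task to a model per the cost strategy. Only the final
# polish step may use the paid DeepSeek model; every other task label
# falls through to the free Groq model.
GROQ_MODEL = "llama-3.3-70b-versatile"
DEEPSEEK_MODEL = "deepseek-chat"

def pick_model(task: str) -> str:
    # "final_polish" is a hypothetical task name used for illustration
    return DEEPSEEK_MODEL if task == "final_polish" else GROQ_MODEL
```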
User Query → NL-to-SQL (Groq) → Data Fetch → Gen Agent (Groq) → QA Agent (Groq) → Response

If the QA score is below the threshold: retry (max 3x, all on Groq).
Otherwise: optional DeepSeek polish, then the final response.
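The Gen→QA retry loop above can be sketched as follows. The callable signatures and the 0.8 threshold are assumptions for illustration, not the real orchestrator API:

```python
MAX_RETRIES = 3
QA_THRESHOLD = 0.8  # assumed score threshold; the real value lives in config

def run_pipeline(query: str, generate, score, polish=None) -> str:
    """generate/score/polish stand in for the Groq/DeepSeek agents.

    Drafts and retries stay on Groq (free); the paid polish step runs
    at most once, only after a draft passes QA or retries are exhausted.
    """
    draft = generate(query)
    for _ in range(MAX_RETRIES):
        if score(draft) >= QA_THRESHOLD:
            break
        draft = generate(query)  # retry on Groq (free)
    return polish(draft) if polish else draft
```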
- Each client has isolated config (YAML)
- Each client has isolated data (BigQuery dataset or mock)
- No shared state between clients
- Framework is client-agnostic
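A per-client YAML file as described above might look like this. The file name and every field except `use_mock` (which is mentioned elsewhere in this document) are hypothetical:

```yaml
# clients/acme.yaml (hypothetical example; field names are assumptions
# based only on the options mentioned in this document)
client_id: acme
name: Acme Corp
use_mock: true        # use MockDataService instead of BigQuery
channels:
  - slack             # Slack first, then Teams, then WhatsApp
```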
- MockDataService generates realistic business data
- Same pipeline as real data (Gen Agent → QA Agent)
- Configurable via use_mock: true in the client YAML
- Seed-based for reproducible demos
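Seed-based generation means the same seed always produces the same demo rows. A stdlib sketch of the pattern (the schema shown is illustrative, not the real MockDataService output):

```python
import random
from datetime import date, timedelta

def generate_mock_sales(seed: int = 42, days: int = 7) -> list[dict]:
    """Return reproducible mock rows: a fixed seed yields identical demos."""
    rng = random.Random(seed)  # isolated RNG, so global state doesn't leak in
    start = date(2024, 1, 1)
    return [
        {
            "day": (start + timedelta(days=i)).isoformat(),
            "revenue": round(rng.uniform(1_000, 5_000), 2),
            "orders": rng.randint(10, 100),
        }
        for i in range(days)
    ]
```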
| File | Purpose |
|---|---|
| src/clawrity/models/client.py | Client configuration models |
| src/clawrity/models/chat.py | Chat request/response models |
| src/clawrity/models/digest.py | Digest report models |
| src/clawrity/models/qa.py | QA scoring models |
| src/clawrity/models/rag.py | RAG recommendation models (Phase 2) |
| src/clawrity/models/forecast.py | Forecast models (Phase 3) |
| src/clawrity/config/settings.py | Environment-based settings |
| src/clawrity/config/client_loader.py | YAML client config loader |
| src/clawrity/services/mock_data.py | Mock data for demo |
| src/clawrity/services/data_service.py | Unified data access (mock + BigQuery) |
| src/clawrity/services/nl_to_sql.py | NL-to-SQL query generation |
| src/clawrity/agents/gen_agent.py | Summary generation (Groq + DeepSeek) |
| src/clawrity/agents/qa_agent.py | Hallucination scoring (Groq) |
| src/clawrity/agents/orchestrator.py | Gen→QA pipeline with retry |
| src/clawrity/utils/exceptions.py | Custom exception hierarchy |
| src/clawrity/utils/logging.py | Structured logging setup |
| src/clawrity/utils/formatters.py | Markdown output formatters |
| infra/ | Terraform scripts for Azure deployment |
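src/clawrity/config/settings.py loads environment-based settings; the CLAWRITY_-prefixed variables below feed it. A minimal stdlib sketch of that pattern (the real module may use pydantic-settings, and only a few fields are shown):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Illustrative env-based settings; values come from CLAWRITY_* variables."""
    app_name: str
    debug: bool
    groq_api_key: str

    @classmethod
    def from_env(cls, prefix: str = "CLAWRITY_") -> "Settings":
        def get(name: str, default: str = "") -> str:
            return os.environ.get(prefix + name, default)
        return cls(
            app_name=get("APP_NAME", "clawrity"),
            debug=get("DEBUG", "false").lower() == "true",
            groq_api_key=get("GROQ_API_KEY"),  # never hardcoded, per the rules above
        )
```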
CLAWRITY_APP_NAME=clawrity
CLAWRITY_DEBUG=false
CLAWRITY_LOG_LEVEL=INFO
# Groq (FREE — use for all non-critical tasks)
CLAWRITY_GROQ_API_KEY=gsk-...
CLAWRITY_GROQ_MODEL=llama-3.3-70b-versatile
# DeepSeek (PAID — use only for final responses)
CLAWRITY_DEEPSEEK_API_KEY=sk-...
CLAWRITY_DEEPSEEK_BASE_URL=https://api.deepseek.com
CLAWRITY_DEEPSEEK_MODEL=deepseek-chat
# BigQuery (empty = use mock)
CLAWRITY_GOOGLE_APPLICATION_CREDENTIALS=
CLAWRITY_BIGQUERY_PROJECT_ID=
# Slack
CLAWRITY_SLACK_BOT_TOKEN=xoxb-...
CLAWRITY_SLACK_APP_TOKEN=xapp-...
# Server
CLAWRITY_HOST=0.0.0.0
CLAWRITY_PORT=8000
CLAWRITY_CLIENTS_DIR=clients
CLAWRITY_DATA_DIR=data

- Unit tests: test each model, service, and agent in isolation
- Integration tests: Test full pipeline with mock data
- Coverage target: 80%+ for business logic
- Run tests: pytest tests/unit/ from the project root
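Testing agents "in isolation" typically means stubbing the LLM call. A self-contained illustration of that style (the interface shown is an assumption, not the real gen-agent API):

```python
# Illustrative isolation pattern: the LLM is replaced with a fake so the
# unit test runs without network access or API keys.
class FakeLLM:
    def invoke(self, prompt: str) -> str:
        return "stubbed summary"

def summarize(llm, data: dict) -> str:
    # stand-in for a gen-agent method under test
    return llm.invoke(f"Summarize: {data}")

def test_summarize_uses_llm():
    assert summarize(FakeLLM(), {"revenue": 100}) == "stubbed summary"
```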
Before committing:
- ruff check . (no linting errors; TC001 suggestions are OK)
- ruff format . (consistent formatting)
- pytest tests/unit/ (all tests pass)
- Conventional commits: feat:, fix:, refactor:, test:, docs:, chore:
- Atomic commits: one logical change per commit
- Never commit secrets or .env files
- Don't modify Pydantic model structure without updating all dependent code
- Don't change exception hierarchy without updating error handlers
- Don't add new dependencies without checking pyproject.toml compatibility
- Don't hardcode client-specific logic — use YAML config instead
- Don't use DeepSeek for non-critical tasks — use Groq (free) instead
- Phase 0: COMPLETE
- Phase 1: COMPLETE (core pipeline)
- Phase 2: NOT STARTED (RAG recommendations)
- Phase 3: NOT STARTED (ML forecasting)
- Get a Groq API key (free at https://console.groq.com)
- Add keys to the .env file
- Test end-to-end with uvicorn clawrity.api.app:app --reload
- Deploy to Azure with cd infra && terraform apply
- Begin Phase 2 (RAG) or Phase 3 (Forecasting)