# Sharpie

Self-hostable AI prompt playground with local LLM support.

Build, test, and share AI prompts with zero API costs. Run entirely on your machine with Docker.
## Features

- One-command setup - `docker-compose up` and you're running
- Fully self-hosted - your prompts never leave your machine
- Zero API costs - uses local Ollama models (qwen2.5:3b by default)
- Share & Fork - generate shareable URLs for any prompt
- Real-time streaming - watch AI responses generate live
- Markdown rendering - beautifully formatted responses with syntax highlighting
- GPU accelerated - leverages your NVIDIA GPU automatically
- No host dependencies - everything runs in Docker containers
## Prerequisites

- Docker Desktop installed
- 10GB free disk space (for the Ollama model and Docker images)
- (Optional) NVIDIA GPU with CUDA support
## Quick Start

```bash
# Clone the repository
git clone https://github.com/heyrtl/sharpie.git
cd sharpie

# Start all services
docker-compose up --build
```

That's it! Open http://localhost:5173 in your browser.

The first run takes 5-10 minutes to download the Qwen2.5-3B model (~2GB).
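To confirm everything came up, two standard commands help: `docker-compose ps` lists the running services, and `ollama list` inside the Ollama container shows whether the model download has finished.

```bash
# All Sharpie services should show as running
docker-compose ps

# qwen2.5:3b should appear here once the download completes
docker exec -it sharpie-ollama ollama list
```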
## Usage

- Write your prompts - system and user prompts in the editor
- Run - press `Cmd/Ctrl + Enter` or click "Run Prompt"
- Share - click "Share" to get a shareable URL
- Fork - click "Fork" to create a copy and modify it
### Sharing prompts

Share URLs like http://localhost:5173?p=abc123 with anyone running Sharpie (see the sketch after this list). They can:

- View your prompt
- Run it with their own local model
- Fork and modify it
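For illustration only, here is a rough sketch of how a client could resolve the `p` share ID against the backend. The `/api/prompts/{id}` route and the response shape are assumptions, not Sharpie's documented API:

```python
# Hypothetical sketch: resolve a share ID such as ?p=abc123 into a stored prompt.
# The route and JSON fields below are illustrative assumptions.
import httpx

def fetch_shared_prompt(share_id: str, backend: str = "http://localhost:8000") -> dict:
    # Assumed route; the backend's real path may differ
    resp = httpx.get(f"{backend}/api/prompts/{share_id}", timeout=10)
    resp.raise_for_status()
    return resp.json()  # assumed shape, e.g. {"system": "...", "user": "..."}

if __name__ == "__main__":
    print(fetch_shared_prompt("abc123"))
```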
### Switching models

- Click the settings icon
- Select from the available Ollama models
- Models are auto-detected from your Ollama instance (see the sketch below)
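Auto-detection maps naturally onto Ollama's REST API: `GET /api/tags` returns the installed models. A minimal sketch of that call (the helper name is ours, not Sharpie's):

```python
# List models installed in the local Ollama instance via its /api/tags endpoint.
import httpx

def list_local_models(host: str = "http://localhost:11434") -> list[str]:
    resp = httpx.get(f"{host}/api/tags", timeout=10)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]

if __name__ == "__main__":
    print(list_local_models())  # e.g. ['qwen2.5:3b']
```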
## Architecture

```
Frontend (React + Vite)
          ↓
  Backend (FastAPI)
          ↓
  Ollama (Local LLM)
          ↓
SQLite (Prompt Storage)
```

- Frontend: React app with a real-time streaming UI
- Backend: FastAPI server handling prompts and streaming (sketched below)
- Ollama: local LLM inference with GPU acceleration
- SQLite: embedded database for saved prompts
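To make the streaming path concrete, here is a minimal sketch of how a FastAPI backend can proxy Ollama's streamed output to the browser. The `/api/run` route and request shape are illustrative assumptions; only the Ollama `/api/generate` call is real Ollama API:

```python
# Minimal sketch of the Backend -> Ollama streaming hop (illustrative, not
# Sharpie's actual code). Ollama's /api/generate streams newline-delimited JSON.
import json

import httpx
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()
OLLAMA_HOST = "http://ollama:11434"  # matches the compose service name

class RunRequest(BaseModel):  # assumed request shape
    system: str = ""
    prompt: str
    model: str = "qwen2.5:3b"

@app.post("/api/run")  # hypothetical route name
async def run_prompt(req: RunRequest):
    async def token_stream():
        payload = {
            "model": req.model,
            "system": req.system,
            "prompt": req.prompt,
            "stream": True,
        }
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream(
                "POST", f"{OLLAMA_HOST}/api/generate", json=payload
            ) as resp:
                async for line in resp.aiter_lines():
                    if line:  # each non-empty line is one JSON chunk from Ollama
                        yield json.loads(line).get("response", "")

    return StreamingResponse(token_stream(), media_type="text/plain")
```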
## Custom models

Pull any Ollama model:

```bash
docker exec -it sharpie-ollama ollama pull llama3.2:3b
docker exec -it sharpie-ollama ollama pull mistral:7b
```

Then select it in Settings.
## Configuration

Copy `.env.example` to `.env` and customize:

```bash
OLLAMA_HOST=http://ollama:11434
DATABASE_PATH=/app/data/sharpie.db
```

The GPU is auto-detected. To disable GPU use and run CPU-only, remove the `deploy` section from `docker-compose.yml` (see the snippet below).
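For reference, a Compose GPU grant typically looks like the block below; this is the standard Docker Compose device-reservation syntax, though the exact layout in Sharpie's `docker-compose.yml` may differ slightly. Deleting it forces CPU-only inference:

```yaml
services:
  ollama:
    # Remove this deploy block to run Ollama on CPU only
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```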
## Project structure

```
sharpie/
├── backend/             # FastAPI server
│   ├── main.py          # API routes
│   ├── database.py      # SQLite handlers
│   ├── models.py        # Pydantic models
│   └── utils.py         # Helpers
├── frontend/            # React app
│   └── src/
│       ├── App.jsx
│       ├── components/
│       └── utils/
└── docker-compose.yml
```
## Local development

Backend:

```bash
cd backend
pip install -r requirements.txt
uvicorn main:app --reload
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

Make sure Ollama is running separately; a quick way to do that is shown below.
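If you don't have a standalone Ollama instance yet, the stock CLI covers it. When the backend runs outside Docker, `OLLAMA_HOST` should point at this local instance (typically `http://localhost:11434`):

```bash
# Start the Ollama server (listens on localhost:11434 by default)
ollama serve

# In another terminal, pull the default model Sharpie uses
ollama pull qwen2.5:3b
```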
## Troubleshooting

### Port already in use

Change the ports in `docker-compose.yml`:

```yaml
ports:
  - "8001:8000"  # Backend
  - "5174:5173"  # Frontend
```

### Model not downloading

Manually pull the model:
```bash
docker exec -it sharpie-ollama ollama pull qwen2.5:3b
```

### GPU not detected

Check the NVIDIA Docker runtime:

```bash
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```

If it fails, you may need to install `nvidia-container-toolkit`.
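On Debian/Ubuntu, assuming NVIDIA's apt repository is already configured, the install usually comes down to:

```bash
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker   # register the runtime with Docker
sudo systemctl restart docker
```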
### Out of disk space

The Ollama model requires ~2GB. Free up space or use a smaller model:

```bash
docker exec -it sharpie-ollama ollama pull qwen2.5:0.5b
```

## Contributing

Contributions are welcome! Here's how:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
See CONTRIBUTING.md for detailed setup instructions.
## Roadmap

- Multi-model API support (OpenAI, Claude, Gemini)
- Prompt versioning and history
- Collaborative editing
- Export prompts as JSON
- Prompt analytics
- Browser extension
- Model comparison view
## Security

See SECURITY.md for security considerations and best practices.
## License

MIT License - see LICENSE for details.
## Author

Ratul Rahman (@heyrtl)

- Website: ratul-rahman.com
- GitHub: @heyrtl
- Twitter: @heyrtl
## Acknowledgments

- Ollama for local LLM inference
- FastAPI for the backend framework
- React for the frontend
- Qwen team for the excellent small language models
If you find this useful, consider giving it a star!
Built with care for the prompt engineering community.







