- Make sure that nothing is listening on ports 8000 and 8080. Open 3 generously sized terminals on your screen.
- Download a sensible model. Qwen 3.5 4B is sensible.
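Before launching anything, it can help to confirm both ports really are free. A minimal stdlib sketch (the port numbers are the ones this guide uses; nothing here is specific to llama.cpp or MCP):

```python
import socket

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # connect_ex returns 0 when something accepts the connection,
        # i.e. the port is already taken.
        return s.connect_ex((host, port)) != 0

for port in (8000, 8080):
    print(f"port {port}: {'free' if port_is_free(port) else 'IN USE'}")
```

If a port is in use, find the culprit with `lsof -i :8080` and stop it before continuing.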
- Compile a fresh llama.cpp: `git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp && cmake -B build && cmake --build build --config Release -j 6`
- Launch the llama in terminal #1: `./build/bin/llama-server -m ~/Downloads/Qwen3.5-4B-Q8_0.gguf --ctx-size 4096 --temp 1.0 --top-p 0.95 --top-k 20 --min-p 0.00 --verbose --webui-mcp-proxy`
- Clone this repository: `git clone https://github.com/behavioral-ds/mcp-example && cd mcp-example`
- Install deps: `poetry install && poetry shell`
- Launch the MCP server in terminal #2: `python mcp_serve.py`
- Execute the Agentic Call™ in terminal #3: `python call.py`
- Observe the dance between LLM <-> inference engine <-> MCP <-> client.
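That dance is a JSON-RPC 2.0 exchange: the client asks the MCP server which tools exist, the LLM picks one, the client relays the call, and the result is fed back to the LLM. A sketch of the message shapes (method names follow the MCP spec; the `get_weather` tool and its arguments are invented for illustration):

```python
import json

# Client -> MCP server: discover available tools (MCP method "tools/list").
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Client -> MCP server: the LLM picked a tool, relay the call.
# "get_weather" and its arguments are hypothetical examples.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Sydney"}},
}

# MCP server -> client: the tool result, which the client hands back
# to the LLM so it can compose its final answer.
call_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {"content": [{"type": "text", "text": "22 °C, sunny"}]},
}

print(json.dumps(call_request, indent=2))
```

Watching the `--verbose` output of llama-server in terminal #1 while `call.py` runs in terminal #3 lets you see each of these hops happen.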
- Open the llama.cpp web UI at http://localhost:8080/, go to settings, and add a new MCP server.
- Then click "Use prompt" and rejoice.
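Under the hood, `call.py` presumably talks to llama-server's OpenAI-compatible endpoint on port 8080; the actual contents of `call.py` are the repo's own, so this is only a hedged sketch of what such a request could look like (the tool name and schema are invented):

```python
import json
import urllib.request

# llama-server exposes an OpenAI-compatible chat endpoint.
LLAMA_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str) -> dict:
    """Build a chat request advertising one hypothetical MCP-backed tool."""
    return {
        "model": "qwen",  # llama-server serves whatever model it loaded
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

def send(payload: dict) -> dict:
    """POST the payload and return the decoded JSON response."""
    req = urllib.request.Request(
        LLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    reply = send(build_chat_request("What's the weather in Sydney?"))
    print(json.dumps(reply, indent=2))
```

If the model decides to use the tool, the response's first choice carries a `tool_calls` entry instead of plain text, and the client's job is to execute it via the MCP server and loop the result back.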