# MCP 101

## Calling a tool

1. Make sure that nothing is listening on ports 8000 and 8080. Open three generously sized terminals on your screen.
2. Download a sensible model. Qwen 3.5 4B is sensible.
3. Compile fresh llama.cpp:

   ```shell
   git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
   cmake -B build && cmake --build build --config Release -j 6
   ```

4. Launch the llama in terminal #1:

   ```shell
   ./build/bin/llama-server -m ~/Downloads/Qwen3.5-4B-Q8_0.gguf --ctx-size 4096 --temp 1.0 --top-p 0.95 --top-k 20 --min-p 0.00 --verbose --webui-mcp-proxy
   ```

5. Clone this repository:

   ```shell
   git clone https://github.com/behavioral-ds/mcp-example && cd mcp-example
   ```

6. Install dependencies: `poetry install && poetry shell`
7. Launch the MCP server in terminal #2: `python mcp_serve.py`
8. Execute the Agentic Call™ in terminal #3: `python call.py`
9. Observe the dance between LLM <-> Inference engine <-> MCP <-> Client.

## Using MCP prompts

1. Open the llama web UI at http://localhost:8080/, go to Settings and add a new MCP server.
2. Select "MCP prompt" when drafting a new message.
3. That's your `@mcp.prompt()` parsed into a UI element, click it.
4. ...and supply some meaningful content.
5. Then click "Use prompt" and rejoice.


## About

Tiny MCP example
