This guide walks you through using moves from installation to giving your first hands-free presentation.
- Installation
- Initial Setup
- Creating Your First Speaker
- Preparing for Presentation
- Giving a Presentation
- Managing Speakers
- Troubleshooting
- FAQ
## Installation

moves requires Python 3.13 or newer.
On Windows:
- Download from python.org
- Run installer, check "Add Python to PATH"
- Verify: Open PowerShell and run:

  ```powershell
  python --version
  ```
Install uv:

```powershell
pip install uv
```

Or use your preferred package manager. uv is optional; you can also use pip directly.
Then install moves:

```powershell
uv tool install moves-cli
```

Or with pip:
```powershell
pip install moves-cli
```

Verify the installation:

```powershell
moves --version
```

You should see something like: `moves-cli version 0.3.2`
## Initial Setup

moves stores all speaker data in `C:\Users\<YourUsername>\.moves\`:

```text
~/.moves/
├── settings.toml         # Your LLM configuration
└── speakers/
    └── <speaker-id>/
        ├── speaker.yaml  # Speaker metadata
        └── sections.md   # Speech content for slides
```
Check this directory:

```powershell
dir $env:USERPROFILE\.moves
```

## Creating Your First Speaker

You'll need:
- Presentation file – Your slides in PDF, DOCX, PPTX, or TXT format (e.g., `my_talk.pdf`, `my_talk.pptx`)
- Transcript file – Text file with what you'll say (e.g., `my_talk.txt`)
Supported presentation formats (all handled by free, open-source libraries, no commercial licenses required):
- PDF - Via PyMuPDF4LLM (optimized for LLM processing)
- DOCX - Via python-docx (Microsoft Word documents)
- PPTX - Via python-pptx (PowerPoint presentations)
- TXT - Native text file support
The transcript should be organized roughly by slide, like:

```text
Slide 1: Hello everyone, thanks for coming today. I'm excited to share this project.
Slide 2: This is the overview. We have three main topics to cover...
Slide 3: First topic, diving deeper now...
```
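If you assemble your transcript programmatically, a file in this shape is easy to split per slide. A minimal sketch (illustrative only; moves does its own parsing internally):

```python
import re

def split_transcript(text):
    """Split a 'Slide N: ...' transcript into {slide_number: text}.

    Illustrative helper only; moves parses transcripts itself.
    """
    parts = re.split(r"(?m)^Slide (\d+):\s*", text)
    # re.split with a capturing group yields ['', '1', 'text...', '2', ...]
    return {int(n): body.strip() for n, body in zip(parts[1::2], parts[2::2])}

transcript = """Slide 1: Hello everyone, thanks for coming today.
Slide 2: This is the overview. We have three main topics to cover...
"""
print(split_transcript(transcript)[1])  # → Hello everyone, thanks for coming today.
```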
Run:

```powershell
moves speaker add MyTalk C:\path\to\my_talk.pdf C:\path\to\my_talk.txt
```

Replace the paths with your actual files. You'll see:

```text
Speaker MyTalk (a1b2c) has been successfully added.

Data directory:
C:\Users\<YourUsername>\.moves\speakers\a1b2c

Presentation source:
C:\path\to\my_talk.pdf

Transcript source:
C:\path\to\my_talk.txt
```
Congratulations! You've created your first speaker profile.
If your files are on Google Drive:

```powershell
moves speaker add MyTalk `
  "https://drive.google.com/file/d/1ABC2DEF3GHI/view?usp=sharing" `
  "https://drive.google.com/file/d/1JKL2MNO3PQR/view?usp=sharing"
```

The tool will download them automatically and authenticate if needed.
## Preparing for Presentation

The easiest method is automatic preparation: an LLM analyzes your transcript and generates speech content for each slide.
First, you need to set up an LLM. Here are popular free options:
Using Google Gemini (Recommended for Free):
- Go to Google AI Studio
- Create a new API key (free tier available)
- In moves, run:

  ```powershell
  moves settings set model gemini/gemini-2.5-flash-lite
  moves settings set format chat
  # Optional endpoint override:
  # moves settings set base_url https://your-openai-compatible-endpoint/v1
  moves settings set key
  ```
- Paste your API key when prompted (text will be hidden)
Using OpenAI:
- Create account at OpenAI
- Get API key from account settings
- In moves, run:

  ```powershell
  moves settings set model gpt-4o-mini
  moves settings set format chat
  # Optional endpoint override:
  # moves settings set base_url https://your-openai-compatible-endpoint/v1
  moves settings set key
  ```
Using other providers: See Configuration Guide for more options.
Once your LLM is configured, run:

```powershell
moves speaker prepare MyTalk
```

This will:
- Analyze your PDF slides
- Process your transcript
- Use LLM to generate speech content for each slide
- Create the `sections.md` file
A progress bar tracks the work; it typically takes 30-60 seconds.
If you don't want to use an LLM, or prefer full control:

```powershell
moves speaker prepare MyTalk --manual
```

This creates an empty template. You then edit `sections.md` manually:
```powershell
# Open the sections file in your editor
notepad $env:USERPROFILE\.moves\speakers\a1b2c\sections.md
```

The file looks like:
```markdown
# 1. Slide
Content for slide 1

# 2. Slide
Content for slide 2

# 3. Slide
Content for slide 3
```

Add your speech notes for each slide:
```markdown
# 1. Slide
Hello everyone, thanks for coming. I'm excited to share this.

# 2. Slide
This is the overview. Three main topics today.

# 3. Slide
First, let me dive deeper into the architecture...
```

Save and you're ready!
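If you want to sanity-check a hand-edited file, the `# N. Slide` layout is simple to parse. A minimal sketch (illustrative only; the parsing moves actually uses may differ):

```python
import re

def parse_sections(markdown):
    """Parse sections.md-style text into {slide_number: content}.

    Illustrative only; not moves' internal parser.
    """
    parts = re.split(r"(?m)^# (\d+)\. Slide\s*$", markdown)
    return {int(n): body.strip() for n, body in zip(parts[1::2], parts[2::2])}

doc = """# 1. Slide
Hello everyone, thanks for coming.

# 2. Slide
This is the overview.
"""
print(parse_sections(doc)[2])  # → This is the overview.
```

A quick check like this catches a mistyped heading before you present.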
## Giving a Presentation

```powershell
moves present MyTalk
```

You'll see a dashboard showing:
- Current slide / total slides
- Recognized speech
- Similarity scores
- System status
Microphone listening: Speak naturally about each slide's content. When your speech matches the section in sections.md, the slide advances automatically.
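To build intuition for that matching step, here is a rough sketch using Python's difflib; moves' real scoring pipeline is more sophisticated, so treat this purely as an illustration:

```python
from difflib import SequenceMatcher

def similarity(spoken, section):
    """Rough 0-1 similarity between recognized speech and a section.

    Illustration only; not moves' actual scoring algorithm.
    """
    return SequenceMatcher(None, spoken.lower(), section.lower()).ratio()

section = "This is the overview. Three main topics today."
score = similarity("this is the overview three main topics", section)
print(f"{score:.2f}")  # a high score: close enough that the slide would advance
```

The takeaway: you don't need a word-for-word match; phrasing close to `sections.md` scores high enough to advance.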
Keyboard shortcuts (in case you need to step in):

- `←` – Go to previous slide
- `→` – Go to next slide
- `M` – Pause/Resume listening (stop for questions, resume when ready)
- `Q` – Exit presentation
Tips for best results:
- Speak clearly – Articulate words distinctly
- Match your script – Use similar phrasing to what's in `sections.md`
- Steady pace – Normal conversational speed works best
- Quiet environment – Less background noise = better recognition
- Test first – Do a dry run before the real presentation
Press Ctrl+C to stop. You'll see:

```text
Presentation ended.
```
## Managing Speakers

```powershell
moves speaker list
```

Shows:

```text
There are 2 registered speaker(s).

NAME       ID     STATUS     LAST PROCESSED
MyTalk     a1b2c  Ready      2 hours ago
OtherTalk  d1e2f  Not Ready  Never
```
```powershell
moves speaker show MyTalk
```

Shows all metadata, paths, and status.
If you fix your PDF or update your transcript:

```powershell
moves speaker edit MyTalk `
  --presentation C:\new\my_talk.pdf `
  --transcript C:\new\my_talk.txt
```

Then re-prepare:

```powershell
moves speaker prepare MyTalk
```

To delete a speaker:

```powershell
moves speaker delete MyTalk
```

Or delete all:

```powershell
moves speaker delete --all
```

You'll be asked to confirm. All data will be removed.
## Troubleshooting

Problem: `moves speaker list` shows no speakers.
Solution:
- Check if any exist: `dir $env:USERPROFILE\.moves\speakers`
- If the directory is empty, add a speaker: `moves speaker add ...`
- If the directory exists but nothing shows, check permissions on `~/.moves/`
Problem: `moves present MyTalk` says the speaker hasn't been prepared.

Solution:

```powershell
moves speaker prepare MyTalk
```

If using an LLM, check that it is configured:

```powershell
moves settings list
```

Problem: During presentation, no speech is being recognized.
Solutions:
- Test Windows audio: Settings → Sound → Volume mixer
- Check if microphone is muted: Click volume icon, unmute microphone
- Test microphone: Run `moves present MyTalk` and try speaking
- Try a different microphone if available
Problem: You're speaking but nothing is recognized.
Causes & Solutions:
- Content mismatch – Your `sections.md` might not match what you're actually saying
  - Edit `sections.md` to match your real script
  - Re-prepare with a better transcript
- Accent/pronunciation – STT might struggle with your accent
  - Speak more clearly and deliberately
  - Use Pause (`M` key) between sections
- Background noise – Too much noise confuses the model
  - Find a quieter room
  - Close notifications, silence your phone
Problem: Can't prepare speaker, says model or API key not configured.
Solution:

```powershell
moves settings set model gemini/gemini-2.5-flash-lite
moves settings set format chat
moves settings set key
# (paste your API key when prompted)
```

Or use manual mode (no LLM):

```powershell
moves speaker prepare MyTalk --manual
```

Problem: Preparing again warns that "sections.md has been modified".
Solution: This is normal if you edited sections.md manually. Choose:
- Yes (y) – Continue, keep your edits
- Save (s) – Update the hash so it doesn't warn again
- No (n) – Cancel and don't update
Problem: First run is very slow.
Reason: ONNX models (~400-500MB) are downloading.
Solution:
- Wait for completion (5-10 minutes depending on internet)
- Check download progress: `dir $env:USERPROFILE\.moves\ml_models`
## FAQ

Q: Is my voice sent to the cloud?

A: No. Speech recognition happens offline using local ONNX models. Your voice never leaves your machine. The only cloud call is to the LLM (if you use automatic preparation), which is optional; you can use `--manual` mode instead.
Q: Can I use Word or PowerPoint files directly?

A: Yes! moves-cli supports multiple formats with 100% free, open-source libraries:
- PDF - PyMuPDF4LLM (optimized for LLM processing)
- DOCX - python-docx (Word documents)
- PPTX - python-pptx (PowerPoint presentations)
- TXT - Native text support
No commercial licenses or PyMuPDF Pro required! Just provide the file path directly:

```powershell
moves speaker add MyTalk C:\talks\my_talk.pptx C:\talks\my_talk.txt
```

Q: What if I don't say exactly what's in my script?

A: It's OK! The tool is designed to be flexible:
- Add approximate content for each slide in `sections.md`
- During the presentation, your speech is matched approximately
- If matching fails, use the keyboard shortcuts (`←` `→`) to navigate manually
Q: Can I use moves for multiple presentations?

A: Yes! Create multiple speakers:
```powershell
moves speaker add Talk1 C:\talks\talk1.pdf C:\talks\talk1.txt
moves speaker add Talk2 C:\talks\talk2.pdf C:\talks\talk2.txt
moves present Talk1 # First presentation
moves present Talk2 # Different presentation
```

Q: How accurate is the automatic slide advancement?

A: Accuracy depends on:
- Content match – How well your `sections.md` matches what you say
- Speech clarity – Clear speech is recognized better
- Audio quality – A quiet environment works best
Typical accuracy: 85-95% automatic advances with manual backups via keyboard.
Q: Can I use other LLM providers?

A: Yes! Any provider supported by LiteLLM works:
```powershell
moves settings set model claude-3-5-sonnet
# or
moves settings set model gpt-4o
# or others...
```

See the Configuration Guide for the full list.
Q: What if I delete or break my sections.md?

A: You can re-prepare to regenerate it:

```powershell
moves speaker prepare MyTalk
```

This will overwrite your manual edits. If you had custom content, keep a backup.
Q: Does moves work offline?

A: Partially:

- Presentation phase – Yes, fully offline once the models are downloaded
- Preparation phase – Requires internet for the LLM, unless you use `--manual` mode
Q: How much disk space does moves use?

A:
- Models: ~400-500MB (one-time download)
- Speaker data: ~1-10MB per speaker (depending on presentation size)
- Total: ~500MB + speaker data
Q: Can I use PowerPoint or Google Slides without converting to PDF?

A: Yes! moves-cli supports multiple formats:

- PDF - Direct support (no export needed)
- PPTX - Direct support via python-pptx
- DOCX - Direct support via python-docx

If you prefer to work with PDF:
- PowerPoint: File → Export as PDF or use .pptx directly
- Google Slides: File → Download → PDF (or PPTX)
- LibreOffice: File → Export as PDF
Then use the file with moves.
Q: What if preparation is interrupted?

A: Interrupting (Ctrl+C) will stop the process. You can restart:

```powershell
moves speaker prepare MyTalk
```

It will re-process from the beginning.
Need more help? Check the Architecture Guide for technical details or see the CLI Reference for all available commands.