Strategy: Build the "Ingestion Engine" first. If we can't get data in effortlessly, the system is useless.
Goal: Establish the local storage structure and the mechanism for receiving data from the outside world.
- US-1.1: As a user, I want a defined folder structure (
Inbox,Archive,Error) so that I know where to put files. - US-1.2: As a user, I want a Python script that watches the
Inboxfolder for new.jsonor.mdfiles. - US-1.3: As a user, I want the system to move processed files to
Archiveand failed files toErrorwith a log.
- Manual: Drop a file named
test.jsonintoInbox/. - Result: File disappears from
Inbox/and appears inArchive/YYYY-MM-DD/. - Automated: Run
pytest tests/test_daemon.py.
ingest_daemon.py(File watcher).- Directory structure setup script.
Goal: Build the logic to transform URLs and Tweets into clean Markdown.
- US-2.1 (Web): As a user, when I drop a URL into the Inbox, the system should fetch the page, strip the HTML boilerplate, and save the main content as Markdown.
- US-2.2 (Twitter): As a user, when I drop a Tweet URL, the system should (via
yt-dlp) fetch the Tweet text and Author. - US-2.3 (Text): As a user, when I drop a text snippet, it is saved directly.
- Web Test: Drop JSON
{ "type": "url", "payload": "https://example.com" }.- Result:
Archive/.../Example Domain.mdcreated with content.
- Result:
- Twitter Test: Drop JSON
{ "type": "url", "payload": "https://twitter.com/..." }.- Result:
Archive/.../tweet_123.mdcreated with text and author.
- Result:
- Automated: Run
pytest tests/test_connectors.py.
connectors/web_scraper.pyconnectors/twitter_fetcher.py
Goal: Make the data searchable and "smart".
- US-3.1: As a system, when a file is successfully ingested, I automatically generate a vector embedding for it.
- US-3.2: As a system, I store this embedding in a local ChromaDB collection.
- US-3.3: As a user, I can run a CLI command
os search "query"and get semantically relevant results.
- Ingest: Process 3 distinct notes (e.g., about "Apples", "SpaceX", "React").
- Search: Run
python main.py search "fruit". - Result: The "Apples" note is returned as the top result.
brain/vector_store.py(Chroma wrapper).brain/embedder.py(Embedding model wrapper).
Goal: Enable "Capture on the Go".
- US-4.1: As a user, I want an iOS Shortcut that accepts a "Share" input (URL/Text) and saves it as a file to my iCloud
OriginSteward/Inboxfolder. - US-4.2: As a user, I want a simple local web page (running on my Mac) where I can paste a URL if I'm on desktop.
- iOS Shortcut file (or instructions).
- Simple Flask/Streamlit "Drop" page.
Goal: Proactive value.
- US-5.1: As a user, I want a "Daily Digest" that shows me 3 random or relevant notes from the past.
- US-5.2: As a user, I want to ask "What have I read about X?" and get a synthesized answer (RAG).
steward/digest.pysteward/chat.py