Feature: Backup classifications to JSONL to survive SQLite drops #136

@ccage-simp

Is your feature request related to a problem? Please describe.
Currently, the expensive LLM classification data (categories, primaryCategory, domains, primaryDomain) lives only in the local SQLite database, bookmarks.db. If the database is dropped or corrupted (e.g., via ft index --force or during a schema migration), all classification data is lost and the user has to re-classify their bookmarks.

The Multi-Machine Sync Use Case
This also affects users who sync their ~/.ft-bookmarks/ folder across different computers using a private Git repository. Because bookmarks.db is a binary file (and is correctly .gitignore'd), pulling the latest bookmarks.jsonl to a second machine and running ft index produces a fully populated database with none of the classifications. Each machine then spends redundant LLM calls categorizing the same bookmarks.

Describe the solution you'd like
Implement a secondary "source of truth" file: ~/.ft-bookmarks/classifications.jsonl. This file will serve as an append-only ledger for all classification results.
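
To make the "append-only ledger" idea concrete, here is a minimal sketch in Python. It is not the code from the PR. The record shape (an `id` key plus whichever classification fields a run produced) and the helper names `append_classifications` and `load_classifications` are assumptions for illustration. The sketch also assumes that later lines override earlier ones field by field, so a re-classification only ever adds lines.

```python
import json
from pathlib import Path


def append_classifications(ledger: Path, records: list[dict]) -> None:
    """Append one JSON object per line; existing lines are never rewritten."""
    ledger.parent.mkdir(parents=True, exist_ok=True)
    with ledger.open("a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_classifications(ledger: Path) -> dict[str, dict]:
    """Fold the ledger into {bookmark id: fields}; later lines win per field."""
    merged: dict[str, dict] = {}
    if not ledger.exists():
        return merged
    with ledger.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. after a Git merge
            record = json.loads(line)
            # "id" is a hypothetical key name for the bookmark identifier.
            fields = {k: v for k, v in record.items() if k != "id"}
            merged.setdefault(record["id"], {}).update(fields)
    return merged
```

Folding per field rather than per record lets a category run and a domain run write separate lines for the same bookmark without clobbering each other.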

  1. The "Save" Hook: Whenever ft classify or ft classify-domains finishes a batch, it appends the results to classifications.jsonl.
  2. The "Self-Heal" Hook: Before ft index rebuilds the database, it loads classifications.jsonl into memory. When inserting or updating a bookmark in SQLite, it merges the classification data in, hydrating the DB without needing new LLM calls.
  3. The "Export" Utility: Add ft classify --export to scan the current database and generate the classifications.jsonl file for existing users.
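
The "Self-Heal" and "Export" steps above can be sketched roughly as follows, again in Python and not taken from the PR. The table layout, the choice to store list fields as JSON text, and the function names `rebuild_index` and `export_classifications` are all assumptions. `ledger` is assumed to be an in-memory mapping of bookmark id to classification fields, already loaded from classifications.jsonl.

```python
import json
import sqlite3

FIELDS = ("categories", "primaryCategory", "domains", "primaryDomain")


def rebuild_index(conn: sqlite3.Connection, bookmarks: list[dict],
                  ledger: dict[str, dict]) -> None:
    """Recreate the table from scratch, merging ledger data into each row."""
    conn.execute("DROP TABLE IF EXISTS bookmarks")
    conn.execute(
        "CREATE TABLE bookmarks (id TEXT PRIMARY KEY, url TEXT, "
        "categories TEXT, primaryCategory TEXT, domains TEXT, primaryDomain TEXT)"
    )  # hypothetical schema
    for bm in bookmarks:
        saved = ledger.get(bm["id"], {})
        # Lists are stored as JSON text; missing fields stay NULL.
        values = [
            json.dumps(saved[f]) if isinstance(saved.get(f), list) else saved.get(f)
            for f in FIELDS
        ]
        conn.execute(
            "INSERT INTO bookmarks VALUES (?, ?, ?, ?, ?, ?)",
            (bm["id"], bm["url"], *values),
        )
    conn.commit()


def export_classifications(conn: sqlite3.Connection) -> list[dict]:
    """Return one ledger record per row that has any classification data."""
    rows = conn.execute(
        "SELECT id, categories, primaryCategory, domains, primaryDomain "
        "FROM bookmarks "
        "WHERE categories IS NOT NULL OR primaryCategory IS NOT NULL "
        "OR domains IS NOT NULL OR primaryDomain IS NOT NULL ORDER BY id"
    ).fetchall()
    records = []
    for bookmark_id, *values in rows:
        record = {"id": bookmark_id}
        for field, value in zip(FIELDS, values):
            if value is None:
                continue
            # Decode the list-valued fields back from JSON text.
            is_list = field in ("categories", "domains")
            record[field] = json.loads(value) if is_list else value
        records.append(record)
    return records
```

In this sketch a forced rebuild followed by an export reproduces the ledger contents, which is the property that lets a second machine hydrate its database without new LLM calls.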

Additional Context
I have implemented this architecture in Draft PR #137. It protects against data loss during database rebuilds and makes syncing a classified library across multiple machines much more efficient.
