feat: classification persistence via classifications.jsonl #137
Draft
ccage-simp wants to merge 1 commit into
Description
This PR implements "Classification Persistence" to protect classification data if the SQLite database is dropped, and to allow classifications to be synced across machines.
It introduces a secondary source of truth: `~/.ft-bookmarks/classifications.jsonl`.

The Use Case: Multi-Machine Syncing
Users who sync their `~/.ft-bookmarks/` folder across computers via Git currently lose their classifications because `bookmarks.db` is ignored. Running `ft index` on a second machine builds a fresh database with zero classifications, leading to redundant LLM usage to re-categorize the same data.

The Implementation
- `classifyWithLlm` and `classifyDomainsWithLlm` now append a `ClassificationRecord` to `classifications.jsonl` whenever a batch successfully returns from the LLM.
- `buildIndex` in `src/bookmarks-db.ts` now reads `classifications.jsonl` into memory before rebuilding. Unclassified records that have a backup in JSONL are automatically hydrated with their categories and domains.
- `ft classify --export` lets existing users dump their current SQLite classifications into the JSONL backup file.

This ensures that classifications are preserved during index rebuilds and can be tracked in Git alongside `bookmarks.jsonl` for easier cross-machine syncing.
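
For reviewers, the append-then-hydrate flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the fields of `ClassificationRecord`, the `BookmarkRow` shape, and the helper names `appendClassifications`, `loadClassifications`, and `hydrate` are all assumptions made for the example, and rows are plain objects rather than SQLite rows.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical record shape: the PR names ClassificationRecord but its fields are assumed here.
interface ClassificationRecord {
  id: string;
  categories: string[];
  domains: string[];
}

// Hypothetical stand-in for a bookmark row during an index rebuild.
interface BookmarkRow {
  id: string;
  categories: string[];
  domains: string[];
}

// Backup location named in the PR description.
const DEFAULT_FILE = path.join(os.homedir(), ".ft-bookmarks", "classifications.jsonl");

// Append one JSON object per line after a batch returns successfully.
function appendClassifications(file: string, records: ClassificationRecord[]): void {
  if (records.length === 0) return;
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const lines = records.map((r) => JSON.stringify(r)).join("\n") + "\n";
  fs.appendFileSync(file, lines, "utf8");
}

// Read the backup into memory, keyed by id. Later lines overwrite earlier ones,
// so a re-classification appended later takes precedence.
function loadClassifications(file: string): Map<string, ClassificationRecord> {
  const byId = new Map<string, ClassificationRecord>();
  if (!fs.existsSync(file)) return byId;
  for (const line of fs.readFileSync(file, "utf8").split("\n")) {
    if (line.trim() === "") continue;
    try {
      const rec = JSON.parse(line) as ClassificationRecord;
      byId.set(rec.id, rec);
    } catch {
      // Skip lines that are not valid records (for example, Git conflict markers).
    }
  }
  return byId;
}

// Fill in rows that have no classification yet but do have a backup entry.
// Returns the number of rows restored.
function hydrate(rows: BookmarkRow[], backup: Map<string, ClassificationRecord>): number {
  let restored = 0;
  for (const row of rows) {
    const unclassified = row.categories.length === 0 && row.domains.length === 0;
    const saved = backup.get(row.id);
    if (unclassified && saved) {
      row.categories = saved.categories;
      row.domains = saved.domains;
      restored++;
    }
  }
  return restored;
}
```

In this sketch a rebuild would call `loadClassifications(DEFAULT_FILE)` once and pass the resulting map to `hydrate`, so only rows still unclassified afterwards need to go to the LLM. Letting the last line win per id is a design choice of the example that keeps the file append-only; whether the PR resolves duplicates the same way is not stated in the description.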