15 changes: 10 additions & 5 deletions CLAUDE.md
@@ -29,16 +29,21 @@ Single CLI application built with Commander.js. All data stored in `~/.ft-bookma
| `src/bookmarks-viz.ts` | ANSI terminal dashboard |
| `src/chrome-cookies.ts` | Chrome cookie extraction (macOS Keychain) |
| `src/xauth.ts` | OAuth 2.0 flow |
| `src/graphql-user-sync.ts` | GraphQL sync for likes, timeline, and feed |
| `src/db.ts` | WASM SQLite layer (sql.js-fts5) |

### Data flow

```
Chrome cookies → GraphQL API → JSONL cache → SQLite FTS5 index
Regex classification
Search / List / Viz
Chrome cookies → GraphQL API → JSONL caches → SQLite FTS5 index
│ (bookmarks.jsonl,
│ likes.jsonl,
│ timeline.jsonl,
│ feed.jsonl)
Regex classification
Search / List / Viz
```

### Dependencies
31 changes: 23 additions & 8 deletions README.md
@@ -18,13 +18,12 @@ Requires Node.js 20+. Chrome recommended for session sync; OAuth available for a
# 1. Sync your bookmarks (needs Chrome logged into X)
ft sync

# 2. Search them
ft search "distributed systems"
# 2. Sync likes, your timeline, or your feed
ft sync-likes yourhandle
ft sync-feed

# 3. Explore
ft viz
ft categories
ft stats
# 3. Search them
ft search "distributed systems"
```

On first run, `ft sync` extracts your X session from Chrome and downloads your bookmarks into `~/.ft-bookmarks/`.
@@ -43,15 +42,20 @@ On first run, `ft sync` extracts your X session from Chrome and downloads your b
| `ft sync --folder <name>` | Sync a single folder by name (exact or unambiguous prefix) |
| `ft sync --classify` | Sync then classify new bookmarks with LLM |
| `ft sync --api` | Sync via OAuth API (cross-platform) |
| `ft sync-likes <user>` | Sync liked tweets (no API required) |
| `ft sync-timeline <user>` | Sync your own tweets (no API required) |
| `ft sync-feed` | Sync your Following feed (no API required) |
| `ft auth` | Set up OAuth for API-based sync (optional) |

### Search and browse

| Command | Description |
|---------|-------------|
| `ft search <query>` | Full-text search with BM25 ranking |
| `ft list` | Filter by author, date, category, domain, or folder |
| `ft search --source likes` | Search within a specific source |
| `ft list` | Filter by author, date, category, domain, folder, or source |
| `ft list --folder <name>` | Show bookmarks in an X bookmark folder |
| `ft list --source <source>` | Filter by source (bookmarks, likes, timeline, feed) |
| `ft show <id>` | Show one bookmark in detail |
| `ft sample <category>` | Random sample from a category |
| `ft stats` | Top authors, languages, date range |
@@ -112,16 +116,24 @@ Then ask your agent:

> "I bookmarked a number of new open source AI memory tools. Pick the best one and figure out how to incorporate it in this repo."

> "What topics have I liked the most this month?"
>
> "Every day please sync any new X bookmarks using the Field Theory CLI."

Works with Claude Code, Codex, or any agent with shell access.

## Scheduling

```bash
# Sync every morning at 7am
# Sync bookmarks every morning at 7am
0 7 * * * ft sync

# Sync likes daily
0 7 * * * ft sync-likes yourhandle

# Sync feed every 6 hours
0 */6 * * * ft sync-feed

# Sync and classify every morning
0 7 * * * ft sync --classify
```
@@ -133,6 +145,9 @@ All data is stored locally at `~/.ft-bookmarks/`:
```
~/.ft-bookmarks/
bookmarks.jsonl # raw bookmark cache (one per line)
likes.jsonl # liked tweets cache
timeline.jsonl # your own tweets cache
feed.jsonl # Following feed cache
bookmarks.db # SQLite FTS5 search index
bookmarks-meta.json # sync metadata
oauth-token.json # OAuth token (if using API mode, chmod 600)
87 changes: 67 additions & 20 deletions src/bookmarks-db.ts
@@ -2,12 +2,18 @@ import type { Database } from 'sql.js';
import { openDb, saveDb } from './db.js';
import { parseTimestampMs, toIsoDate } from './date-utils.js';
import { readJsonLines } from './fs.js';
import { twitterBookmarksCachePath, twitterBookmarksIndexPath } from './paths.js';
import {
twitterBookmarksCachePath,
twitterBookmarksIndexPath,
twitterLikesCachePath,
twitterTimelineCachePath,
twitterFeedCachePath,
} from './paths.js';
import type { BookmarkRecord, QuotedTweetSnapshot } from './types.js';
import { classifyCorpus, formatClassificationSummary } from './bookmark-classify.js';
import type { ClassificationSummary } from './bookmark-classify.js';

const SCHEMA_VERSION = 6;
const SCHEMA_VERSION = 7;

export interface SearchResult {
id: string;
@@ -26,6 +32,8 @@ export interface SearchOptions {
before?: string;
after?: string;
folder?: string;
/** Filter by source: bookmarks, likes, timeline, feed */
source?: string;
}

export interface BookmarkTimelineItem {
@@ -64,6 +72,8 @@ export interface BookmarkTimelineFilters {
category?: string;
domain?: string;
folder?: string;
/** Filter by source: bookmarks, likes, timeline, feed */
source?: string;
sort?: 'asc' | 'desc';
limit?: number;
offset?: number;
@@ -179,6 +189,10 @@ function buildBookmarkWhereClause(filters: BookmarkTimelineFilters): {
);
params.push(filters.folder);
}
if (filters.source) {
conditions.push(`b.source = ?`);
params.push(filters.source);
}

return {
where: conditions.length ? `WHERE ${conditions.join(' AND ')}` : '',
@@ -239,10 +253,12 @@ function initSchema(db: Database): void {
article_site TEXT,
enriched_at TEXT,
folder_ids TEXT,
folder_names TEXT
folder_names TEXT,
source TEXT DEFAULT 'bookmarks'
)`);

db.run(`CREATE INDEX IF NOT EXISTS idx_bookmarks_author ON bookmarks(author_handle)`);
db.run(`CREATE INDEX IF NOT EXISTS idx_bookmarks_source ON bookmarks(source)`);
db.run(`CREATE INDEX IF NOT EXISTS idx_bookmarks_posted ON bookmarks(posted_at)`);
db.run(`CREATE INDEX IF NOT EXISTS idx_bookmarks_language ON bookmarks(language)`);
db.run(`CREATE INDEX IF NOT EXISTS idx_bookmarks_category ON bookmarks(primary_category)`);
@@ -314,6 +330,9 @@ function ensureMigrations(db: Database): void {
ensureColumn(db, 'bookmarks', 'folder_ids', 'TEXT');
ensureColumn(db, 'bookmarks', 'folder_names', 'TEXT');

ensureColumn(db, 'bookmarks', 'source', "TEXT DEFAULT 'bookmarks'");
db.run('CREATE INDEX IF NOT EXISTS idx_bookmarks_source ON bookmarks(source)');

// FTS rebuild: only if the FTS table is missing the article_text column.
// Check via a zero-row SELECT so we don't rebuild unnecessarily.
if (!ftsHasColumn(db, 'article_text')) {
@@ -350,15 +369,15 @@ function serializeJsonArray(values: string[] | undefined | null): string | null
return JSON.stringify(values);
}

function insertRecord(db: Database, r: BookmarkRecord, preserved?: PreservedBookmarkFields): void {
function insertRecord(db: Database, r: BookmarkRecord, source: string = 'bookmarks', preserved?: PreservedBookmarkFields): void {
// Extract GitHub URLs (kept inline — no LLM needed for URL parsing)
const text = r.text ?? '';
const githubMatches = text.match(/github\.com\/[\w.-]+\/[\w.-]+/gi) ?? [];
const githubFromLinks = (r.links ?? []).filter((l) => /github\.com/i.test(l));
const githubUrls = [...new Set([...githubMatches.map((m) => `https://${m}`), ...githubFromLinks])];

db.run(
`INSERT OR REPLACE INTO bookmarks VALUES (${Array(37).fill('?').join(',')})`,
`INSERT OR REPLACE INTO bookmarks VALUES (${Array(38).fill('?').join(',')})`,
[
r.id,
r.tweetId,
@@ -397,14 +416,31 @@ function insertRecord(db: Database, r: BookmarkRecord, preserved?: PreservedBook
preserved?.enrichedAt ?? null,
serializeJsonArray(r.folderIds) ?? preserved?.folderIds ?? null,
serializeJsonArray(r.folderNames) ?? preserved?.folderNames ?? null,
source,
]
);
}
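
A possible follow-up, not part of this PR: the positional `VALUES (?, ..., ?)` form means the literal `38` has to be kept in step with the table's column count and order by hand. One way to make that harder to get wrong is to build the statement from an explicit column list. The sketch below is only an illustration under that assumption; `buildInsertSql` and the abbreviated column list are hypothetical.

```typescript
// Hypothetical helper, not part of this PR. Builds an INSERT statement from an
// explicit column list so the placeholder count cannot drift from the columns.
function buildInsertSql(table: string, columns: readonly string[]): string {
  if (columns.length === 0) {
    throw new Error('at least one column is required');
  }
  const placeholders = columns.map(() => '?').join(',');
  return `INSERT OR REPLACE INTO ${table} (${columns.join(', ')}) VALUES (${placeholders})`;
}

// Abbreviated, illustrative column list; the real table has 38 columns.
const exampleColumns = ['id', 'author_handle', 'folder_names', 'source'] as const;
const exampleSql = buildInsertSql('bookmarks', exampleColumns);
console.log(exampleSql);
```

With named columns, the values array would have to follow the same order as the list, so the list becomes the single place where column order is defined.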

export async function buildIndex(options?: { force?: boolean }): Promise<{ dbPath: string; recordCount: number; newRecords: number }> {
const cachePath = twitterBookmarksCachePath();
const dbPath = twitterBookmarksIndexPath();
const records = await readJsonLines<BookmarkRecord>(cachePath);

// Collect records from all sources
const sources: Array<{ path: string; source: string }> = [
{ path: twitterBookmarksCachePath(), source: 'bookmarks' },
{ path: twitterLikesCachePath(), source: 'likes' },
{ path: twitterTimelineCachePath(), source: 'timeline' },
{ path: twitterFeedCachePath(), source: 'feed' },
];

const taggedRecords: Array<{ record: BookmarkRecord; source: string }> = [];
for (const { path, source } of sources) {
try {
const records = await readJsonLines<BookmarkRecord>(path);
for (const record of records) {
taggedRecords.push({ record, source });
}
} catch { /* file may not exist */ }
}

const db = await openDb(dbPath);
try {
@@ -448,13 +484,13 @@ export async function buildIndex(options?: { force?: boolean }): Promise<{ dbPat
}
} catch { /* table may be empty */ }

const newRecords: BookmarkRecord[] = records.filter(r => !existingRows.has(r.id));
const newEntries = taggedRecords.filter(({ record }) => !existingRows.has(record.id));

if (records.length > 0) {
if (taggedRecords.length > 0) {
db.run('BEGIN TRANSACTION');
try {
for (const record of records) {
insertRecord(db, record, existingRows.get(record.id));
for (const { record, source } of taggedRecords) {
insertRecord(db, record, source, existingRows.get(record.id));
}
db.run('COMMIT');
} catch (err) {
@@ -468,7 +504,7 @@ export async function buildIndex(options?: { force?: boolean }): Promise<{ dbPat

saveDb(db, dbPath);
const totalRows = db.exec('SELECT COUNT(*) FROM bookmarks')[0]?.values[0]?.[0] as number;
return { dbPath, recordCount: totalRows, newRecords: newRecords.length };
return { dbPath, recordCount: totalRows, newRecords: newEntries.length };
} finally {
db.close();
}
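
One behavior that may be worth checking here: rows are written with `INSERT OR REPLACE`, so if `id` is the table's primary key and the same tweet carries the same `id` in two caches (neither is visible in these hunks, so both are assumptions), the row would keep whichever `source` is processed last, and `newRecords` could count that tweet twice. The sketch below models this with a plain `Map`; `simulateUpsert` and the sample data are hypothetical.

```typescript
type Source = 'bookmarks' | 'likes' | 'timeline' | 'feed';

interface TaggedRecord {
  record: { id: string };
  source: Source;
}

// Models INSERT OR REPLACE keyed on id: a later entry overwrites an earlier one.
function simulateUpsert(entries: TaggedRecord[]): Map<string, Source> {
  const rows = new Map<string, Source>();
  for (const { record, source } of entries) {
    rows.set(record.id, source);
  }
  return rows;
}

// Illustrative data: tweet '1' is both bookmarked and liked.
const tagged: TaggedRecord[] = [
  { record: { id: '1' }, source: 'bookmarks' },
  { record: { id: '2' }, source: 'bookmarks' },
  { record: { id: '1' }, source: 'likes' },
];

const rows = simulateUpsert(tagged);
const entryCount = tagged.length; // what a filter over tagged entries would count
const rowCount = rows.size;       // how many distinct rows would remain
console.log(rows.get('1'), entryCount, rowCount);
```

If that overlap can happen in practice, a composite key of `id` and `source`, or a separate membership table, would be two ways to keep both facts.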
@@ -530,6 +566,10 @@ export async function searchBookmarks(options: SearchOptions): Promise<SearchRes
conditions.push(`b.posted_at <= ?`);
params.push(options.before);
}
if (options.source) {
conditions.push(`b.source = ?`);
params.push(options.source);
}

const where = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
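
Because `source` is typed as a free-form `string`, a misspelled value such as `like` would bind cleanly and simply match no rows. A minimal sketch of validating the value before it reaches the query, assuming the four source names listed in the doc comment are the full set; `SOURCES`, `Source`, and `parseSource` are hypothetical names, not part of this PR.

```typescript
// Hypothetical names: SOURCES, Source, and parseSource are not part of this PR.
const SOURCES = ['bookmarks', 'likes', 'timeline', 'feed'] as const;
type Source = (typeof SOURCES)[number];

// Returns a validated source, undefined when no filter was requested,
// and throws on anything outside the known set.
function parseSource(value: string | undefined): Source | undefined {
  if (value === undefined) return undefined;
  const normalized = value.trim().toLowerCase();
  if ((SOURCES as readonly string[]).indexOf(normalized) !== -1) {
    return normalized as Source;
  }
  throw new Error(`Unknown source "${value}". Expected one of: ${SOURCES.join(', ')}`);
}

console.log(parseSource(' Likes '), parseSource(undefined));
```

The filter itself can stay as it is; only the option type would narrow from `string` to the union.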

@@ -779,7 +819,7 @@ export async function getBookmarkById(id: string): Promise<BookmarkTimelineItem
}
}

export async function getStats(): Promise<{
export async function getStats(options?: { source?: string }): Promise<{
totalBookmarks: number;
uniqueAuthors: number;
dateRange: { earliest: string | null; latest: string | null };
@@ -788,19 +828,25 @@ export async function getStats(): Promise<{
}> {
const dbPath = twitterBookmarksIndexPath();
const db = await openDb(dbPath);
ensureMigrations(db);

const src = options?.source;
const sourceFilter = src ? 'WHERE source = ?' : '';
const sourceAnd = src ? 'AND source = ?' : '';

Missing ensureMigrations call in getStats

Medium Severity

`getStats` now queries the `source` column (added in schema v7) when `options.source` is provided, but unlike `searchBookmarks`, `listBookmarks`, and other DB-reading functions, it never calls `ensureMigrations(db)`. On a pre-v7 database, running `ft stats --source likes` will crash because the `source` column doesn't exist yet.

Reviewed by Cursor Bugbot for commit 5984733.

Author:

Fixed in 64163a2 — added ensureMigrations(db) call in getStats.
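
For context on why the missing call mattered: the migration step only helps if it is idempotent, so that every reader can call it safely. The real `ensureColumn` is not shown in this diff; the sketch below is a hypothetical reimplementation against a minimal slice of the sql.js `Database` surface (`exec` and `run`), with an in-memory stand-in so it runs without a database.

```typescript
// Minimal slice of the sql.js Database surface used by this sketch.
interface MinimalDb {
  exec(sql: string): Array<{ columns: string[]; values: unknown[][] }>;
  run(sql: string): void;
}

// Hypothetical reimplementation; the real ensureColumn is not shown in this diff.
// Returns true when the column had to be added, false when it already existed.
function ensureColumnSketch(db: MinimalDb, table: string, column: string, definition: string): boolean {
  const info = db.exec(`PRAGMA table_info(${table})`);
  // PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
  const existing = (info[0]?.values ?? []).map((row) => String(row[1]));
  if (existing.indexOf(column) !== -1) return false;
  db.run(`ALTER TABLE ${table} ADD COLUMN ${column} ${definition}`);
  return true;
}

// In-memory stand-in for a database, for illustration only.
function makeFakeDb(initialColumns: string[]): MinimalDb & { columns: string[] } {
  const columns = initialColumns.slice();
  return {
    columns,
    exec: () => [{
      columns: ['cid', 'name', 'type', 'notnull', 'dflt_value', 'pk'],
      values: columns.map((name, i) => [i, name, 'TEXT', 0, null, 0]),
    }],
    run: (sql: string) => {
      const name = /ADD COLUMN (\w+)/.exec(sql)?.[1];
      if (name !== undefined) columns.push(name);
    },
  };
}

const fakeDb = makeFakeDb(['id', 'folder_names']);
const addedFirst = ensureColumnSketch(fakeDb, 'bookmarks', 'source', "TEXT DEFAULT 'bookmarks'");
const addedSecond = ensureColumnSketch(fakeDb, 'bookmarks', 'source', "TEXT DEFAULT 'bookmarks'");
console.log(addedFirst, addedSecond, fakeDb.columns);
```

The second call is a no-op, which is the property that makes it cheap to call the migration at the top of every read path, `getStats` included.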


try {
const total = db.exec('SELECT COUNT(*) FROM bookmarks')[0]?.values[0]?.[0] as number;
const authors = db.exec('SELECT COUNT(DISTINCT author_handle) FROM bookmarks')[0]?.values[0]?.[0] as number;
const postedAtRows = db.exec('SELECT posted_at FROM bookmarks WHERE posted_at IS NOT NULL');
const total = db.exec(`SELECT COUNT(*) FROM bookmarks ${sourceFilter}`, src ? [src] : [])[0]?.values[0]?.[0] as number;
const authors = db.exec(`SELECT COUNT(DISTINCT author_handle) FROM bookmarks ${sourceFilter}`, src ? [src] : [])[0]?.values[0]?.[0] as number;
const postedAtRows = db.exec(`SELECT posted_at FROM bookmarks WHERE posted_at IS NOT NULL ${sourceAnd}`, src ? [src] : []);
const range = chronologicalDateRange(
(postedAtRows[0]?.values ?? []).map((row) => row[0])
);

const topAuthorsRows = db.exec(
`SELECT author_handle, COUNT(*) as c FROM bookmarks
WHERE author_handle IS NOT NULL
GROUP BY author_handle ORDER BY c DESC LIMIT 15`
WHERE author_handle IS NOT NULL ${sourceAnd}
GROUP BY author_handle ORDER BY c DESC LIMIT 15`,
src ? [src] : []
);
const topAuthors = (topAuthorsRows[0]?.values ?? []).map((r) => ({
handle: r[0] as string,
@@ -809,8 +855,9 @@ export async function getStats(): Promise<{

const langRows = db.exec(
`SELECT language, COUNT(*) as c FROM bookmarks
WHERE language IS NOT NULL
GROUP BY language ORDER BY c DESC LIMIT 10`
WHERE language IS NOT NULL ${sourceAnd}
GROUP BY language ORDER BY c DESC LIMIT 10`,
src ? [src] : []
);
const languageBreakdown = (langRows[0]?.values ?? []).map((r) => ({
language: r[0] as string,
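
The `sourceFilter` / `sourceAnd` pair and the repeated `src ? [src] : []` work, but each query has to pick the right variant by hand. A small helper that returns the clause together with its parameters is one alternative. This is a sketch only; `withSource` and `SqlFragment` are hypothetical and do not exist in this PR.

```typescript
interface SqlFragment {
  sql: string;
  params: string[];
}

// Hypothetical helper; combines an optional base condition with an optional
// source filter and returns the WHERE clause plus its bound parameters.
function withSource(baseCondition: string | null, source?: string): SqlFragment {
  const conditions: string[] = [];
  const params: string[] = [];
  if (baseCondition) conditions.push(baseCondition);
  if (source) {
    conditions.push('source = ?');
    params.push(source);
  }
  return {
    sql: conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '',
    params,
  };
}

const totalClause = withSource(null, 'likes');
const langClause = withSource('language IS NOT NULL', 'likes');
const unfilteredClause = withSource('language IS NOT NULL');
console.log(totalClause, langClause, unfilteredClause);
```

Each query would then interpolate `clause.sql` and pass `clause.params`, so the SQL text and the bound values cannot fall out of step.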