feat: capture bookmark reply threads #141

Open
ericlitman wants to merge 5 commits into afar1:main from ericlitman:codex/thread-capture

@ericlitman ericlitman commented May 7, 2026

Summary

  • add ft sync --threads to capture parent reply context and same-author continuations for bookmarked X posts
  • preserve reply media/link snapshots and include thread text in SQLite FTS search
  • keep thread enrichment resumable so rate limits do not block media fetches or index rebuilds

Why

X threads increasingly carry the paper, tool, model, or app links in replies to avoid link penalties on the primary post. Bookmark search needs to retain and search that reply context, not just the root post.

Notes

  • TweetDetail parsing is isolated in src/tweet-snapshots.ts to keep graphql-bookmarks.ts from growing another large endpoint-specific parser block.
  • Empty recognized TweetDetail timelines are stamped as permanent empty checks, while unparseable tweet-shaped responses remain transient so one parser miss does not poison the queue.
  • Existing DBs migrate by adding thread_text, backfilling it from stored thread JSON, and rebuilding FTS.
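The permanent-versus-transient distinction in the second note can be sketched roughly as follows. This is an illustration only, not the PR's code: the status names other than 'ok', 'empty', and 'error', the `QueueEntry` shape, and the `threadCheckedAt` field are all assumptions made for the example.

```typescript
// Hypothetical status names; only 'ok', 'empty', and 'error' appear in this PR's discussion.
type DetailStatus = 'ok' | 'empty' | 'unparseable' | 'rate_limited' | 'error';

// Hypothetical queue entry; the field names are assumptions for illustration.
interface QueueEntry {
  tweetId: string;
  threadCheckedAt?: string; // permanent stamp: the entry is not retried
  lastTransientStatus?: DetailStatus; // remembered only for diagnostics
}

// Statuses that should leave the entry eligible for a later retry.
const TRANSIENT = new Set<DetailStatus>(['unparseable', 'rate_limited', 'error']);

function applyOutcome(entry: QueueEntry, status: DetailStatus, now: string): QueueEntry {
  if (TRANSIENT.has(status)) {
    // A parser miss or rate limit must not mark the tweet as checked.
    return { ...entry, lastTransientStatus: status };
  }
  // 'ok' and a recognized-but-empty timeline are both final answers.
  return { tweetId: entry.tweetId, threadCheckedAt: now };
}

function needsRetry(entry: QueueEntry): boolean {
  return entry.threadCheckedAt === undefined;
}
```

Under these assumptions, an 'empty' outcome stamps the entry once, while an 'unparseable' outcome leaves it in the queue for the next run.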

Validation

  • npm run build
  • npm test
  • git diff --check

Note

Medium Risk
Adds a new GraphQL-based thread-expansion sync path plus DB schema/FTS migrations; correctness and rate-limit handling depend on X endpoint stability and migration/backfill behavior.

Overview
Adds optional ft sync --threads support to capture reply-thread context (parent chain) and same-author continuations below bookmarked posts, persisting this data to both JSONL cache and the SQLite index.

Extends the DB schema and FTS indexing to store threadContext/threadBelow, compute a searchable thread_text field (with migration backfill + conditional FTS rebuild), and updates gap-fill to also refresh links when text is expanded.
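As a rough sketch of how a searchable thread_text value could be derived from the stored thread data: the `ThreadTweetSnapshot` fields and the `buildThreadText` helper below are hypothetical, and the PR's actual field layout and joining rules may differ.

```typescript
// Hypothetical shape: only the type name ThreadTweetSnapshot appears in this PR.
interface ThreadTweetSnapshot {
  tweetId: string;
  text: string;
}

// Concatenate parent context and same-author continuations, in reading order,
// into one string suitable for a full-text index column.
function buildThreadText(
  threadContext: ThreadTweetSnapshot[] = [],
  threadBelow: ThreadTweetSnapshot[] = [],
): string {
  return [...threadContext, ...threadBelow]
    .map((t) => t.text.trim())
    .filter((text) => text.length > 0)
    .join('\n');
}
```

A pure function like this can serve both the sync path and the migration backfill, since each only needs the stored thread JSON as input.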

Updates media fetching and CLI output to include thread tweet media/profile images and display thread sections, and factors TweetDetail parsing/link expansion into a new tweet-snapshots.ts utility used by the new thread sync flow.

Reviewed by Cursor Bugbot for commit 840c9de. Bugbot is set up for automated code reviews on this repo.

@ericlitman ericlitman marked this pull request as ready for review May 7, 2026 14:17
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Comment thread src/graphql-bookmarks.ts
let below: ThreadTweetSnapshot[] = [];
if (cookies.csrfToken) {
  const detail = await fetchTweetDetailViaGraphQL(record.tweetId, cookies.csrfToken, cookies.cookieHeader, { delayMs });
  if (detail.status !== 'ok') return { context, below: [], status: detail.status };
TweetDetail non-ok status discards fetched parent context

Medium Severity

When fetchTweetDetailViaGraphQL returns a non-ok status (especially 'empty', meaning the timeline was recognized but had no tweet results), expandThreadForRecord propagates that status even though parent context tweets were already successfully fetched. In syncThreads, the non-ok status causes the entire expansion to be treated as failed — the already-fetched context array in expanded.context is never written to the record. For 'empty' specifically — which just means "no continuation below" — this is classified as a permanent failure, stamping threadExpansionFailedAt and preventing any future retry. The successfully-fetched parent context is permanently lost.
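One possible way to address this, sketched under assumptions rather than taken from the PR: treat an 'empty' TweetDetail result as "no continuation" and still report success when parent context was fetched. The `Snapshot`, `DetailResult`, and `mergeDetail` names are hypothetical stand-ins for the real types in src/graphql-bookmarks.ts.

```typescript
type DetailStatus = 'ok' | 'empty' | 'error';

// Hypothetical minimal shapes for illustration.
interface Snapshot {
  tweetId: string;
  text: string;
}
interface DetailResult {
  status: DetailStatus;
  below: Snapshot[];
}
interface Expansion {
  context: Snapshot[];
  below: Snapshot[];
  status: DetailStatus;
}

// Combine already-fetched parent context with the TweetDetail outcome
// without discarding the context on a non-ok detail status.
function mergeDetail(context: Snapshot[], detail: DetailResult): Expansion {
  if (detail.status === 'ok') {
    return { context, below: detail.below, status: 'ok' };
  }
  if (detail.status === 'empty') {
    // No continuation below, but the parent chain is still worth saving.
    return { context, below: [], status: context.length > 0 ? 'ok' : 'empty' };
  }
  // Other failures keep their status; the caller can still inspect `context`.
  return { context, below: [], status: detail.status };
}
```

With this shape, the caller only stamps a permanent empty check when both the parent chain and the continuation are empty.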

Additional Locations (1)
Comment thread src/graphql-bookmarks.ts
    url: record.url,
  });
  await persistProgress();
  throw err;
Unreachable error-handling code in syncThreads catch block

Low Severity

The failed++ and failures.push(...) statements after the if (THREAD_TRANSIENT_FAILURE_STATUSES.has(status)) guard in the catch block are dead code. The default status is 'error', which is a member of THREAD_TRANSIENT_FAILURE_STATUSES. Every path through the inner try block either sets status to a transient value and re-throws, or completes without throwing. Any exception from the fetcher itself leaves status at 'error'. In all cases the guard is true, so the function re-throws before reaching those lines.
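The control flow described here can be modeled in a few lines. This is a simplified reconstruction, not the code under review: `runOnce` and `Outcome` are hypothetical, and the set members other than 'error' are assumed.

```typescript
// 'error' is confirmed as a member by the review; 'rate_limited' is an assumption.
const THREAD_TRANSIENT_FAILURE_STATUSES = new Set<string>(['error', 'rate_limited']);

interface Outcome {
  rethrown: boolean;
  recordedAsFailed: boolean;
}

// Models one iteration of the loop: `work` may update the status and may throw.
function runOnce(work: (setStatus: (s: string) => void) => void): Outcome {
  let status = 'error'; // default, already a transient status
  try {
    work((s) => {
      status = s;
    });
    return { rethrown: false, recordedAsFailed: false };
  } catch {
    if (THREAD_TRANSIENT_FAILURE_STATUSES.has(status)) {
      return { rethrown: true, recordedAsFailed: false }; // stands in for `throw err`
    }
    // Stands in for failed++ / failures.push(...); reached only if a
    // non-transient status was set before the throw.
    return { rethrown: false, recordedAsFailed: true };
  }
}
```

In this model the second branch is reachable only when a non-transient status is assigned before an exception is thrown, which the review says never happens in the real code.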

cursor Bot commented May 7, 2026

You have used all of your free Bugbot PR reviews.

To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
