Skip to content

skills.sh implementation#18

Open
cole-ramer wants to merge 3 commits intostagingfrom
feak/skills.sh
Open

skills.sh implementation#18
cole-ramer wants to merge 3 commits intostagingfrom
feak/skills.sh

Conversation

@cole-ramer
Copy link
Copy Markdown
Collaborator

PR: Skills.sh Catalog Integration

Branch: feak/skills.shstaging

Summary

Integrates the full skills.sh catalog (~50,000 skills) into the platform. Users can browse, search, and add skills to their harnesses. Selected skills are passed to the LLM during chat as a manifest, and a custom get_skill_content tool lets the model retrieve full skill markdown on demand.

What Changed

New: Skills Catalog Proxy (FastAPI)

  • packages/fastapi/app/routes/skills.py — Three new endpoints (/api/skills, /api/skills/search, /api/skills/detail) that proxy the HuggingFace Dataset Server API. List and search responses strip the detail field to keep payloads small for the browser.
  • Registered in main.py under /api/skills.

New: Convex Skills Backend

  • packages/convex-backend/convex/skills.ts — Queries (getByName, getByNames) and an ensureSkillDetails action that lazily fetches and stores full skill markdown from HuggingFace into a new skillDetails table. Skill detail is only persisted when a harness containing that skill is saved, avoiding unnecessary storage.
  • packages/convex-backend/convex/schema.ts — Added skillDetails table (name, skillName, description, detail, code) with a by_name index. Changed harnesses.skills from v.array(v.string()) to v.array(v.object({ name, description })) so descriptions travel with the harness and don't require extra queries at chat time.

Updated: Harness Mutations

  • packages/convex-backend/convex/harnesses.tscreate, update, and duplicate mutations updated for the new skills shape. Includes a one-time migrateSkillsToObjects internal mutation that was used to migrate existing documents from the old string[] format.
  • packages/convex-backend/convex/seed.ts — Seed data updated to use { name, description } skill objects.

New: Frontend Skills Components

  • apps/web/src/components/skills-browser.tsx — Paginated, searchable skill catalog browser using TanStack Query. Checkbox-style selection matching the MCP preset grid pattern.
  • apps/web/src/components/recommended-skills-grid.tsx — Curated grid of 9 recommended skills (vercel-react-best-practices, brainstorming, browser-automation, pdf-vision-reader, systematic-debugging, xlsx, code-review, database-schema-design, git-best-practices).
  • apps/web/src/lib/skills.ts — Shared SkillEntry type and RECOMMENDED_SKILLS data.
  • Each skill entry displays the skill name, repo path (e.g. 0xbigboss/claude-code), and description.

Updated: Harness Editor ($harnessId.tsx)

  • Added "Recommended Skills" grid and "Added Skills" checklist above a "Browse All Skills" dialog.
  • Skill add/remove is managed in local React state (like name, model, MCPs) and only persisted on save.
  • After save, ensureSkillDetails fires in the background to cache markdown for newly added skills.
  • Removed the separate skillDetails Convex query for display — descriptions now come directly from the stored skill objects, eliminating flicker.

Updated: Onboarding (onboarding.tsx)

  • Skills step shows the recommended grid and full catalog browser.
  • Selected skills saved as { name, description } objects on harness creation.
  • ensureSkillDetails called after harness creation to backfill markdown.

Updated: Chat Page (chat/index.tsx)

  • New SkillsStatus component in the chat header showing "X Skills" next to the MCP status indicator. Clicking it reveals a dropdown listing all skills in the active harness with their display name and description.
  • Chat request payload now sends skills: Array<{ name, description }> to FastAPI.

Updated: Chat Loop (FastAPI chat.py)

  • When skills are present, a system message is prepended with the skills manifest (name + description for each skill).
  • A get_skill_content tool definition is added to the tools list sent to the LLM.
  • In the agentic tool execution loop, calls to get_skill_content are intercepted and handled locally — the backend queries Convex for the cached markdown, falling back to HuggingFace if not yet cached.
  • No extra Convex query at chat time for descriptions — they come directly from the request payload.

Updated: FastAPI Models (models.py)

  • Added SkillRef model (name + description).
  • HarnessConfig.skills changed from list[str] to list[SkillRef].

Updated: Convex Service (services/convex.py)

  • Added query_convex helper to execute Convex queries via the HTTP API using the deploy key. Used by the chat route to fetch skill detail markdown.

Minor Fixes

  • message-attachments.tsx — Replaced ESLint ignore comment with biome-ignore for useMediaCaption.
  • chat/index.tsx — Fixed forEach return value lint (useIterableCallbackReturn) by switching to for...of. Added biome-ignore for noStaticElementInteractions on the drag-and-drop zone.
  • Import ordering fixed across all modified files to satisfy biome's organizeImports rule.

Files Changed (22 files, +1204 / -136)

Area Files
FastAPI routes/skills.py (new), routes/chat.py, models.py, services/convex.py, main.py
Convex schema.ts, harnesses.ts, skills.ts (new), seed.ts
Frontend Components skills-browser.tsx (new), recommended-skills-grid.tsx (new)
Frontend Lib skills.ts (new), use-chat-stream.ts, models.ts, multimodal.ts
Frontend Routes chat/index.tsx, harnesses/$harnessId.tsx, onboarding.tsx
Other attachment-chip.tsx, message-attachments.tsx, use-file-attachments.ts

@DIodide
Copy link
Copy Markdown
Owner

DIodide commented Mar 24, 2026

@claude review this, propose change of skills routes from FastAPI to TanStack server funcs

@claude
Copy link
Copy Markdown

claude bot commented Mar 24, 2026

Claude finished @DIodide's task in 2m 17s —— View job


PR Review + TanStack Migration Proposal

  • Gather context from changed files
  • Review PR code
  • Propose TanStack server function migration

Code Review

Overall this is a solid integration. The pattern of storing { name, description } on the harness (so descriptions don't require extra queries at chat time) is a good design decision. A few issues worth addressing:

Bugs / Correctness

  1. /api/skills/detail — unreliable exact-match lookup (routes/skills.py:89–108)
    The endpoint searches by name and then filters the top-5 results for an exact match. If the exact skill name isn't in the top 5 HuggingFace search results (e.g. due to fuzzy ranking), the endpoint silently returns 404. The HuggingFace /rows endpoint supports filtering by field value which would be more reliable for exact lookups.

  2. /api/skills/detail appears to have no callersensureSkillDetails in convex/skills.ts fetches HuggingFace directly (line 84), and _handle_get_skill_content in chat.py also calls HuggingFace directly (line 81). This route is added but unused, creating maintenance surface for nothing. Fix this →

  3. migrateSkillsToObjects left in harnesses.ts — one-time migration mutations should be removed once confirmed applied, or they become confusing dead code that someone might accidentally re-run. Fix this →

Architecture

  1. Circular type dependencyapps/web/src/lib/skills.ts:1 imports SkillRow from ../components/skills-browser. Lib files should not import from component files. SkillRow (and SkillsResponse) should live in skills.ts and be imported by the component. Fix this →

  2. No auth on /api/skills routes — The list/search endpoints proxy public HuggingFace data with no authentication check. Anyone can use your FastAPI server as a free HuggingFace proxy. Worth adding the same get_current_user dependency the chat route uses, or at minimum a simple API key check.

Minor

  1. RECOMMENDED_SKILLS descriptions are hardcoded (skills.ts:14–104) and can drift from the actual HuggingFace dataset descriptions. Consider pulling them from the catalog on first load or at build time.

  2. Hardcoded ngrok URL in main.py:49"https://aab7-140-180-240-231.ngrok-free.app" should not be committed; it should be in env config.


Proposed Migration: Skills Routes → TanStack Server Functions

Why migrate?
The /api/skills and /api/skills/search routes are pure HuggingFace proxies with no business logic. Moving them to TanStack server functions:

  • Removes a cross-origin round-trip (browser → FastAPI → HuggingFace → FastAPI → browser)
  • Eliminates the CORS dependency for this feature
  • Removes VITE_FASTAPI_URL coupling from the skills browser (catalog browsing doesn't need the AI backend running)
  • Adds full TypeScript type safety end-to-end
  • Keeps the pattern consistent with fetchClerkAuth in __root.tsx

What stays in FastAPI: The /api/skills/detail route can be deleted entirely (see issue #2 above). The _handle_get_skill_content function in chat.py already calls HuggingFace directly.


Step 1 — Create apps/web/src/lib/skills-api.ts

import { createServerFn } from "@tanstack/react-start";

const HF_BASE = "https://datasets-server.huggingface.co";
const HF_COMMON = "dataset=tickleliu/all-skills-from-skills-sh&config=default&split=train";

function stripDetail(entry: Record<string, unknown>) {
  const r = (entry.row ?? entry) as Record<string, unknown>;
  return {
    name: (r.name as string) ?? "",
    skill_name: (r.skill_name as string) ?? "",
    description: (r.description as string) ?? "",
    code: (r.code as string) ?? "",
  };
}

export const listSkillsFn = createServerFn({ method: "GET" })
  .validator((data: { offset: number; limit: number }) => data)
  .handler(async ({ data }) => {
    const { offset, limit } = data;
    const resp = await fetch(
      `${HF_BASE}/rows?${HF_COMMON}&offset=${offset}&length=${limit}`,
    );
    if (!resp.ok) throw new Error("Failed to fetch skills from catalog");
    const json = await resp.json();
    return {
      rows: (json.rows as Record<string, unknown>[]).map(stripDetail),
      total: (json.num_rows_total as number) ?? 0,
      offset,
      limit,
    };
  });

export const searchSkillsFn = createServerFn({ method: "GET" })
  .validator((data: { q: string; offset: number; limit: number }) => data)
  .handler(async ({ data }) => {
    const { q, offset, limit } = data;
    const encoded = encodeURIComponent(q);
    const resp = await fetch(
      `${HF_BASE}/search?${HF_COMMON}&query=${encoded}&offset=${offset}&length=${limit}`,
    );
    if (!resp.ok) throw new Error("Failed to search skills catalog");
    const json = await resp.json();
    return {
      rows: (json.rows as Record<string, unknown>[]).map(stripDetail),
      total: (json.num_rows_total as number) ?? 0,
      offset,
      limit,
    };
  });

Step 2 — Update skills-browser.tsx queryFn

Replace the direct fetch call with the server functions:

// Before (skills-browser.tsx:57-63)
queryFn: async () => {
  const endpoint = debouncedSearch
    ? `${API_URL}/api/skills/search?q=${encodeURIComponent(debouncedSearch)}&offset=${offset}&limit=${PAGE_SIZE}`
    : `${API_URL}/api/skills?offset=${offset}&limit=${PAGE_SIZE}`;
  const res = await fetch(endpoint);
  if (!res.ok) throw new Error("Failed to fetch skills");
  return res.json();
},

// After
queryFn: () =>
  debouncedSearch
    ? searchSkillsFn({ data: { q: debouncedSearch, offset, limit: PAGE_SIZE } })
    : listSkillsFn({ data: { offset, limit: PAGE_SIZE } }),

Then remove the API_URL constant and env import from skills-browser.tsx.

Step 3 — Remove the FastAPI skills route

# packages/fastapi/app/main.py
-from app.routes import chat, health, mcp_health, mcp_oauth, skills
+from app.routes import chat, health, mcp_health, mcp_oauth
 
-app.include_router(skills.router, prefix="/api/skills", tags=["skills"])

Delete packages/fastapi/app/routes/skills.py.


Trade-off to consider: TanStack server functions run on the Node.js SSR server, so catalog fetches will originate from your server IP rather than the browser. If HuggingFace has per-IP rate limits this concentrates requests. At 50k skills with pagination this is unlikely to be an issue in practice, but worth keeping in mind. If rate limiting becomes a concern, a light caching layer (e.g. a 60-second in-memory cache keyed on offset+limit+query) in the server function would address it cleanly without needing Redis.

@DIodide
Copy link
Copy Markdown
Owner

DIodide commented Mar 25, 2026

@claude review this

@claude
Copy link
Copy Markdown

claude bot commented Mar 25, 2026

Claude finished @DIodide's task in 2m 46s —— View job


PR Review: skills.sh Catalog Integration

  • Gather context from changed files
  • Review FastAPI routes & chat integration
  • Review Convex backend
  • Review frontend components
  • Review types & architecture

Note: The TanStack migration proposed in the previous review is already applied in commit 0ed7542skills-api.ts uses createServerFn, skills-browser.tsx consumes those functions, and the FastAPI skills.py route was never committed. The migrateSkillsToObjects mutation and the circular type dependency are also resolved. The review below focuses on what remains.


Bugs / Correctness

1. ensureSkillDetails writes empty detail on HuggingFace miss, then _handle_get_skill_content re-fetches HuggingFace on every chat request (skills.ts:99–110, chat.py:75–76)

When the HuggingFace top-5 search doesn't contain an exact name match, ensureSkillDetails stores detail = "". On the next chat, _handle_get_skill_content calls query_convex which returns the stored row, but result.get("detail") is falsy for "" → falls through to the HuggingFace fallback again on every single request. checkExists in ensureSkillDetails prevents re-fetching background fills, but the chat-time path has no such guard.

Fix: either skip insertion when detail is empty (so chat falls back every time but the Convex cache only stores real data), or use a sentinel like detail = "__not_found__" to distinguish "tried but absent" from "not yet cached". Fix this →

2. Chat-time HuggingFace fallback has the same exact-match fragility (chat.py:88–94)

_handle_get_skill_content searches HuggingFace for the skill name and then filters for an exact match in the top-5 results — the same logic the previous review flagged for the removed /api/skills/detail endpoint. If the skill name isn't in the top 5 ranked results, this silently returns "Could not retrieve content". Fix this →

3. Hardcoded ngrok URL committed in main.py:50

"https://aab7-140-180-240-231.ngrok-free.app",

This URL will expire and is a dev artifact. It should either be in env config or removed. Also 127.0.0.1:57177 and 127.0.0.1:57609 (lines 49, 51) look like ephemeral dev ports — fine locally but shouldn't be hardcoded. Fix this →


Architecture

4. insertSkillDetail never updates stale data (skills.ts:46–48)

if (existing) return existing._id;  // always skips if any record exists

If a skill's detail or description changes in the HuggingFace dataset, the cached version is never refreshed. This is probably fine for now, but worth a comment noting it's an intentional write-once cache.

5. SkillsStatus re-implements click-outside instead of using DropdownMenu (chat/index.tsx:1622–1661)

A custom useRef + document.addEventListener("mousedown") + AnimatePresence panel is built, while the file already imports DropdownMenu, DropdownMenuContent, DropdownMenuItem, DropdownMenuTrigger. This duplicates work, misses keyboard navigation (Escape to close), and breaks the app's component consistency. Fix this →

6. SkillEntry type inlined in SkillsStatus signature (chat/index.tsx:1620)

skills: Array<{ name: string; description: string }>;

The shared SkillEntry type from lib/skills should be used here instead of the inline shape. Minor but keeps types in sync if SkillEntry ever grows.


Minor

7. getByNames does N separate DB queries (skills.ts:21–29) — Promise.all over individual ctx.db.query calls for each name. For large skill sets this adds N round-trips. Not a practical concern now, but consider a single query with a filter if this ever serves large harnesses.

8. ensureSkillDetails processes skills sequentially (skills.ts:73) — the for...of loop awaits each HuggingFace call one by one. Using Promise.all would parallelize the fetches. Not a hot path since it's fire-and-forget, but mentioning for completeness.


What's Good

  • The TanStack migration is clean — skills-api.ts server functions replace the FastAPI proxy entirely, skills-browser.tsx has zero FastAPI coupling, and main.py has no skills router.
  • Storing { name, description } on the harness eliminates the extra Convex query at chat time — good design.
  • allowed_skill_names guard in _handle_get_skill_content (chat.py:68–69) correctly prevents prompt injection via crafted skill names.
  • Types are properly layered: SkillRow/SkillsResponse in lib/skills.ts, SkillEntry for harness storage, SkillRef in Python models — all consistent.
  • migrateSkillsToObjects removed, circular lib→component type dependency fixed from the previous review.

@DIodide DIodide self-requested a review March 25, 2026 05:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants