feat(markdown_parser): GFM HTML safe-list and footnotes in .md preview #79

Merged
BunsDev merged 18 commits into main from cast/md-html-support
May 20, 2026

BunsDev (Member) commented May 20, 2026

Summary

  • CodeView's .md Rendered preview now renders GFM raw HTML against a GitHub-style safe-list: <details>/<summary> (always expanded, with a disclosure glyph, in v1), <kbd>, <sub>/<sup>, <del>/<ins>/<u>, raw <table>, <img>, plus inline phrasing tags. Tags outside the safe-list pass through as literal text; <script>/<style>/<iframe>/etc. are stripped.
  • GFM footnotes ([^id] references and [^id]: defn definitions) now resolve: each reference becomes a numbered, superscript-style hyperlink (#fn-id); a horizontal rule and a numbered list appear at the bottom, each entry with a back-link (#fnref-id); unused definitions are dropped.
  • Mermaid (already wired) is verified via the included smoke fixture — no code change.

Spec lives at specs/castcodes-md-gfm-html/. All work is parser-only (crates/markdown_parser/); no editor / renderer / app changes. The bespoke <u> inline parser is removed in favor of routing through the same html5ever-backed helper. 161 unit tests pass.
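For reviewers who want the three-way split in one place, here is a minimal sketch of the safe-list decision described above. It is not the code in `gfm_html.rs`: the names `TagDisposition`, `SAFE_TAGS`, `STRIP_TAGS`, and `classify_tag` are hypothetical, the tag lists are only the subset named in this summary, and case-insensitive matching is an assumption.

```rust
/// How a raw HTML tag is treated in the preview.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TagDisposition {
    /// On the safe-list: handed to the HTML helper and rendered.
    Render,
    /// On the strip list: removed from the output.
    Strip,
    /// Anything else: passed through as literal text.
    Literal,
}

// Subsets taken from the PR summary; the real lists may be longer.
const SAFE_TAGS: &[&str] = &[
    "details", "summary", "kbd", "sub", "sup", "del", "ins", "u", "table", "img",
];
const STRIP_TAGS: &[&str] = &["script", "style", "iframe"];

/// Classify a tag name. Matching is case-insensitive here (an assumption).
pub fn classify_tag(name: &str) -> TagDisposition {
    let lower = name.to_ascii_lowercase();
    if STRIP_TAGS.iter().any(|&t| t == lower.as_str()) {
        TagDisposition::Strip
    } else if SAFE_TAGS.iter().any(|&t| t == lower.as_str()) {
        TagDisposition::Render
    } else {
        TagDisposition::Literal
    }
}
```

Checking the strip list first means a tag can never be rendered by accident if it ends up on both lists.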

Test plan

  • Unit: `cargo test -p markdown_parser` — expect 161 passed, 0 failed.
  • Downstream compile: `cargo check -p warp-app` — expect clean (no errors).
  • Manual smoke test in CodeView (the deferred Phase 5.2 step):
    • Open `crates/markdown_parser/test-fixtures/gfm-smoketest.md` in the app.
    • Click the Rendered toggle in the markdown file header.
    • Verify each section renders as described in `specs/castcodes-md-gfm-html/PLAN-01-html-and-footnotes.md` §Task 5.2 (inline kbd/sub/sup/u/del/mark, block details with disclosure glyph, raw HTML table, mermaid SVG, footnotes section with back-link, `<script>` stripped, existing GFM still works).
  • Lint: `cargo clippy -p markdown_parser --no-deps` — clean.

Deferred follow-ups (from final code review)

  • The footnote rewriter inspects each fragment in isolation, so a reference inside bold text such as `claim[^x]` may not match if the inline emphasis parser splits the bracket fragments differently than expected. Add coverage, and a fix if it reproduces.
  • Footnote definition lines aren't gated on a preceding blank line, so an indented continuation can absorb the line following a mid-paragraph match.
  • Footnote-reference fragments inside tables round-trip to `1` rather than `[^x]` via `inline_to_markdown` (TECH §7 already calls out the round-trip risk class).
  • TECH §6 listed end-to-end `markdown_parser_tests` for several cases that ended up in `html_parser_tests` (helper-level) instead — add the end-to-end variants.
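As a starting point for the blank-line follow-up, here is one possible shape for the gate. This is a sketch, not a proposed patch: `looks_like_definition` and `definition_starts` are hypothetical helpers, and letting a definition directly follow another definition is an assumption about the desired behavior.

```rust
/// True if the line has the `[^id]: text` shape with a non-empty id.
/// (Hypothetical helper, for illustration only.)
fn looks_like_definition(line: &str) -> bool {
    line.strip_prefix("[^")
        .and_then(|rest| rest.split_once("]:"))
        .map_or(false, |(id, _)| !id.is_empty() && !id.contains(']'))
}

/// Indices of lines allowed to start a footnote definition: the line has
/// the definition shape and is the first line, follows a blank line, or
/// directly follows another accepted definition line.
pub fn definition_starts(lines: &[&str]) -> Vec<usize> {
    let mut starts = Vec::new();
    for (i, line) in lines.iter().enumerate() {
        if !looks_like_definition(line) {
            continue;
        }
        let after_blank = i == 0 || lines[i - 1].trim().is_empty();
        let after_definition = i > 0 && starts.last() == Some(&(i - 1));
        if after_blank || after_definition {
            starts.push(i);
        }
    }
    starts
}
```

With this gate, a `[^x]: ...` line in the middle of a paragraph stays paragraph text, so the continuation logic never sees it as the start of a definition.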

Notes

  • 14 signed commits (every commit on this branch shows `Good "git" signature`).
  • AI-attribution guard passes (`./script/check_ai_attribution`).
  • No feature flag added — the change is a refinement of existing default-on markdown rendering. Previous behavior (raw HTML as literal text, `[^id]` as literal text) was a bug, not a feature to toggle.

BunsDev added 14 commits May 20, 2026 10:05

Add parse_html_block_lines, parse_html_inline_fragments, and find_body
as pub(crate) helpers above parse_html for use by the upcoming markdown
GFM HTML block/inline dispatch. No behavior change to parse_html itself.

Wire a two-pass pipeline into parse_markdown_impl: extract_definitions
strips [^id]: blocks before parsing, rewrite_references rewrites inline
[^id] tokens to numbered hyperlinks, and append_section appends the
rendered footnote list with back-reference links.
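
The extract, rewrite, append flow from this commit message can be sketched over plain strings. The function names are borrowed from the message, but the signatures, the single-line definitions, and the markdown-link output are all assumptions; the real passes run around the markdown parse and operate on its fragments. `render_footnotes` is a hypothetical driver.

```rust
/// Pass 1: pull single-line `[^id]: text` definitions out of the source.
/// Returns the remaining body and the definitions in source order.
fn extract_definitions(src: &str) -> (String, Vec<(String, String)>) {
    let mut body = Vec::new();
    let mut defs = Vec::new();
    for line in src.lines() {
        let parsed = line.strip_prefix("[^").and_then(|r| r.split_once("]:"));
        match parsed {
            Some((id, text)) if !id.is_empty() && !id.contains(']') => {
                defs.push((id.to_string(), text.trim().to_string()));
            }
            _ => body.push(line),
        }
    }
    (body.join("\n"), defs)
}

/// Pass 2: replace each defined `[^id]` with a numbered link to `#fn-id`.
/// Numbers follow first-reference order; undefined ids stay literal.
fn rewrite_references(body: &str, defs: &[(String, String)]) -> (String, Vec<String>) {
    let mut out = String::new();
    let mut order: Vec<String> = Vec::new();
    let mut rest = body;
    while let Some(start) = rest.find("[^") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find(']') {
            Some(end) if defs.iter().any(|(id, _)| id == &after[..end]) => {
                let id = &after[..end];
                let n = match order.iter().position(|seen| seen == id) {
                    Some(i) => i + 1,
                    None => {
                        order.push(id.to_string());
                        order.len()
                    }
                };
                out.push_str(&format!("[{}](#fn-{})", n, id));
                rest = &after[end + 1..];
            }
            _ => {
                // Not a known reference: keep the text as written.
                out.push_str("[^");
                rest = after;
            }
        }
    }
    out.push_str(rest);
    (out, order)
}

/// Pass 3: append a rule and a numbered list with back-links to `#fnref-id`.
/// Definitions that were never referenced are not in `order`, so they drop out.
fn append_section(mut text: String, defs: &[(String, String)], order: &[String]) -> String {
    if order.is_empty() {
        return text;
    }
    text.push_str("\n\n---\n");
    for (i, id) in order.iter().enumerate() {
        if let Some((_, def)) = defs.iter().find(|(d, _)| d == id) {
            text.push_str(&format!("\n{}. {} [back](#fnref-{})", i + 1, def, id));
        }
    }
    text
}

/// Hypothetical driver chaining the three passes.
pub fn render_footnotes(src: &str) -> String {
    let (body, defs) = extract_definitions(src);
    let (rewritten, order) = rewrite_references(&body, &defs);
    append_section(rewritten, &defs, &order)
}
```

Numbering by first reference rather than by definition order is what lets unused definitions disappear without leaving gaps in the list.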
Copilot AI review requested due to automatic review settings May 20, 2026 16:02

Copilot AI (Contributor) left a comment

Pull request overview

This PR extends crates/markdown_parser so CodeView’s rendered .md preview supports (1) a GitHub-style safe-listed subset of raw HTML and (2) GFM footnotes, while keeping the public parser API stable and reusing the existing html5ever-based HTML pipeline.

Changes:

  • Add a GFM HTML span lexer + wire block/inline HTML dispatch into the markdown parser.
  • Add a footnote extract → rewrite → append pipeline around the existing markdown parse.
  • Extend the HTML parser to render <details>/<summary>, <img>, raw <table>, and additional phrasing tags; add fixtures/spec docs and unit tests.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `crates/markdown_parser/src/markdown_parser.rs` | Runs footnote pre/post passes; adds block HTML dispatch loop; adds inline HTML token handling. |
| `crates/markdown_parser/src/markdown_parser_tests.rs` | Adds unit coverage for block/inline HTML behavior and stripping. |
| `crates/markdown_parser/src/html_parser.rs` | Exposes pub(crate) HTML fragment helpers; adds `<details>`, `<img>`, raw `<table>`, and more phrasing-tag styling. |
| `crates/markdown_parser/src/html_parser_tests.rs` | Adds regression coverage for fragment parsing + new HTML constructs. |
| `crates/markdown_parser/src/gfm_html.rs` | Introduces safe/strip tag lists and a span lexer for HTML in markdown. |
| `crates/markdown_parser/src/gfm_html_tests.rs` | Adds lexer unit tests (nesting, attributes, comments, malformed tags). |
| `crates/markdown_parser/src/footnotes.rs` | Implements footnote definition extraction, reference rewriting, and section appending. |
| `crates/markdown_parser/src/footnotes_tests.rs` | Adds unit coverage for extraction and end-to-end footnote rendering behavior. |
| `crates/markdown_parser/src/lib.rs` | Registers new private modules (footnotes, gfm_html). |
| `crates/markdown_parser/test-fixtures/gfm-smoketest.md` | Adds a manual smoke fixture for in-app visual verification. |
| `DESIGN-CHANGES.md` | Documents the user-visible preview behavior changes. |
| `specs/castcodes-md-gfm-html/TECH.md` | Adds the technical design/spec for HTML safe-list + footnotes implementation. |
| `specs/castcodes-md-gfm-html/PRODUCT.md` | Adds product requirements and expected rendering behavior. |
| `specs/castcodes-md-gfm-html/PLAN-01-html-and-footnotes.md` | Adds an implementation plan and verification checklist. |

Comment thread crates/markdown_parser/src/footnotes.rs Outdated
Comment thread crates/markdown_parser/src/markdown_parser.rs
Comment thread crates/markdown_parser/src/gfm_html.rs Outdated
Comment thread crates/markdown_parser/src/html_parser.rs Outdated
BunsDev and others added 3 commits May 20, 2026 13:39
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@BunsDev BunsDev merged commit f1ae09d into main May 20, 2026
10 checks passed
@BunsDev BunsDev deleted the cast/md-html-support branch May 20, 2026 19:16
2 participants