feat(markdown_parser): GFM HTML safe-list and footnotes in .md preview #79
Merged
Add parse_html_block_lines, parse_html_inline_fragments, and find_body as pub(crate) helpers above parse_html for use by the upcoming markdown GFM HTML block/inline dispatch. No behavior change to parse_html itself.
Wire a two-pass pipeline into parse_markdown_impl: extract_definitions strips [^id]: blocks before parsing, rewrite_references rewrites inline [^id] tokens to numbered hyperlinks, and append_section appends the rendered footnote list with back-reference links.
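The extract → rewrite → append flow described above can be sketched on plain strings. This is a minimal illustration, not the code in this PR: the function names come from the commit message, but every signature, the `Definition` struct, the `render_footnotes` wrapper, the single-line-definition restriction, the `↩` back-link glyph, and the choice to leave undefined references as literal text are assumptions made for the example.

```rust
/// Hypothetical representation of one `[^id]: text` definition.
#[derive(Debug, Clone, PartialEq)]
struct Definition {
    id: String,
    text: String,
}

/// Parse a single-line definition such as `[^a]: Some text`.
/// (Assumption: multi-line definition blocks are out of scope for this sketch.)
fn parse_definition(line: &str) -> Option<Definition> {
    let rest = line.strip_prefix("[^")?;
    let (id, text) = rest.split_once("]:")?;
    if id.is_empty() || id.contains(char::is_whitespace) {
        return None;
    }
    Some(Definition { id: id.to_string(), text: text.trim().to_string() })
}

/// Pass 1: remove definition lines from the source and collect them.
fn extract_definitions(src: &str) -> (String, Vec<Definition>) {
    let mut body = Vec::new();
    let mut defs = Vec::new();
    for line in src.lines() {
        match parse_definition(line) {
            Some(def) => defs.push(def),
            None => body.push(line),
        }
    }
    (body.join("\n"), defs)
}

/// Pass 2: rewrite `[^id]` tokens that have a definition into numbered links.
/// Returns the rewritten body and the ids in order of first reference.
fn rewrite_references(body: &str, defs: &[Definition]) -> (String, Vec<String>) {
    let mut out = String::new();
    let mut order: Vec<String> = Vec::new();
    let mut rest = body;
    while let Some(start) = rest.find("[^") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let known = after.find(']').and_then(|end| {
            let id = &after[..end];
            defs.iter().any(|d| d.id == id).then_some((id, end))
        });
        match known {
            Some((id, end)) => {
                // Reuse the number if this id was already referenced.
                let n = match order.iter().position(|seen| seen.as_str() == id) {
                    Some(i) => i + 1,
                    None => {
                        order.push(id.to_string());
                        order.len()
                    }
                };
                out.push_str(&format!("[{}](#fn-{})", n, id));
                rest = &after[end + 1..];
            }
            None => {
                // Undefined reference: keep the literal text (assumption).
                out.push_str("[^");
                rest = after;
            }
        }
    }
    out.push_str(rest);
    (out, order)
}

/// Pass 3: append a rule and a numbered list with back-links.
/// Definitions that were never referenced are not in `order`, so they are dropped.
fn append_section(mut body: String, defs: &[Definition], order: &[String]) -> String {
    if order.is_empty() {
        return body;
    }
    body.push_str("\n\n---\n\n");
    for (i, id) in order.iter().enumerate() {
        if let Some(def) = defs.iter().find(|d| &d.id == id) {
            body.push_str(&format!("{}. {} [↩](#fnref-{})\n", i + 1, def.text, id));
        }
    }
    body
}

/// Hypothetical wrapper that chains the three passes.
fn render_footnotes(src: &str) -> String {
    let (body, defs) = extract_definitions(src);
    let (rewritten, order) = rewrite_references(&body, &defs);
    append_section(rewritten, &defs, &order)
}
```

Numbering by first reference rather than by definition order means a definition that is never referenced simply never receives a number, which is one straightforward way to get the "unused definitions are dropped" behavior.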
Pull request overview
This PR extends crates/markdown_parser so CodeView’s rendered .md preview supports (1) a GitHub-style safe-listed subset of raw HTML and (2) GFM footnotes, while keeping the public parser API stable and reusing the existing html5ever-based HTML pipeline.
Changes:
- Add a GFM HTML span lexer + wire block/inline HTML dispatch into the markdown parser.
- Add a footnote extract → rewrite → append pipeline around the existing markdown parse.
- Extend the HTML parser to render <details>/<summary>, <img>, raw <table>, and additional phrasing tags; add fixtures/spec docs and unit tests.
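As a rough illustration of the safe-list idea, the sketch below classifies a tag name into one of three outcomes that the PR Summary describes: render, strip, or pass through as literal text. It is not the contents of gfm_html.rs. The `TagPolicy` enum, `classify_tag`, `tag_name`, and both constant lists are hypothetical names, the lists are deliberately partial (only tags named in this PR), and case-insensitive matching is an assumption.

```rust
/// Hypothetical three-way outcome for a raw HTML tag found in markdown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TagPolicy {
    /// On the safe-list: hand the span to the HTML renderer.
    Render,
    /// Explicitly dangerous: remove from the output.
    Strip,
    /// Neither list: show the tag as literal text.
    Literal,
}

// Partial lists for illustration; the real lists live in the PR's source.
const SAFE_TAGS: &[&str] = &[
    "details", "summary", "kbd", "sub", "sup", "del", "ins", "u", "table", "img",
];
const STRIP_TAGS: &[&str] = &["script", "style", "iframe"];

/// Classify a bare tag name. Matching is ASCII case-insensitive (assumption).
fn classify_tag(name: &str) -> TagPolicy {
    if STRIP_TAGS.iter().any(|t| t.eq_ignore_ascii_case(name)) {
        TagPolicy::Strip
    } else if SAFE_TAGS.iter().any(|t| t.eq_ignore_ascii_case(name)) {
        TagPolicy::Render
    } else {
        TagPolicy::Literal
    }
}

/// Extract the tag name from a raw tag such as `<kbd>`, `</kbd>`, or `<img src="x">`.
/// Returns `None` when the text is not shaped like a tag.
fn tag_name(raw: &str) -> Option<&str> {
    let inner = raw.strip_prefix('<')?.strip_suffix('>')?;
    let inner = inner.strip_prefix('/').unwrap_or(inner);
    let end = inner
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(inner.len());
    if end == 0 {
        None
    } else {
        Some(&inner[..end])
    }
}
```

Checking the strip list before the safe-list keeps the dangerous tags stripped even if a name were accidentally added to both lists.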
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| crates/markdown_parser/src/markdown_parser.rs | Runs footnote pre/post passes; adds block HTML dispatch loop; adds inline HTML token handling. |
| crates/markdown_parser/src/markdown_parser_tests.rs | Adds unit coverage for block/inline HTML behavior and stripping. |
| crates/markdown_parser/src/html_parser.rs | Exposes pub(crate) HTML fragment helpers; adds <details>, <img>, raw <table>, and more phrasing-tag styling. |
| crates/markdown_parser/src/html_parser_tests.rs | Adds regression coverage for fragment parsing + new HTML constructs. |
| crates/markdown_parser/src/gfm_html.rs | Introduces safe/strip tag lists and a span lexer for HTML in markdown. |
| crates/markdown_parser/src/gfm_html_tests.rs | Adds lexer unit tests (nesting, attributes, comments, malformed tags). |
| crates/markdown_parser/src/footnotes.rs | Implements footnote definition extraction, reference rewriting, and section appending. |
| crates/markdown_parser/src/footnotes_tests.rs | Adds unit coverage for extraction and end-to-end footnote rendering behavior. |
| crates/markdown_parser/src/lib.rs | Registers new private modules (footnotes, gfm_html). |
| crates/markdown_parser/test-fixtures/gfm-smoketest.md | Adds a manual smoke fixture for in-app visual verification. |
| DESIGN-CHANGES.md | Documents the user-visible preview behavior changes. |
| specs/castcodes-md-gfm-html/TECH.md | Adds the technical design/spec for HTML safe-list + footnotes implementation. |
| specs/castcodes-md-gfm-html/PRODUCT.md | Adds product requirements and expected rendering behavior. |
| specs/castcodes-md-gfm-html/PLAN-01-html-and-footnotes.md | Adds an implementation plan and verification checklist. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot stopped work on behalf of BunsDev due to an error (May 20, 2026 18:52)
Summary
- .md Rendered preview now renders GFM raw HTML on a GitHub-style safe-list: <details>/<summary> (always expanded with a ▾ glyph in v1), <kbd>, <sub>/<sup>, <del>/<ins>/<u>, raw <table>, <img>, plus inline phrasing tags. Tags outside the safe-list pass through as literal text; <script>/<style>/<iframe>/etc. are stripped.
- GFM footnotes ([^id] references + [^id]: defn definitions) resolve: references become numbered superscript-style hyperlinks (#fn-id); a horizontal rule + numbered list appears at the bottom with a back-link (#fnref-id); unused definitions are dropped.

Spec lives at specs/castcodes-md-gfm-html/. All work is parser-only (crates/markdown_parser/); no editor / renderer / app changes. The bespoke <u> inline parser is removed in favor of routing through the same html5ever-backed helper. 161 unit tests pass.

Test plan
Deferred follow-ups (from final code review)
Notes