
Add extensible content parser interface #8

Merged
pfefferle merged 6 commits into trunk from add/content-parser-interface on Mar 23, 2026


@pfefferle
Member

Proposed changes:

  • Add a Content_Parser interface for pluggable content parsing of the content union field in site.standard.document records.
  • Wire content parsing into the Document transformer with two filters: atmosphere_content_parser (swap/disable parser) and atmosphere_document_content (modify parsed output).
  • Add ?atproto preview endpoint on singular posts to inspect the document record JSON (requires edit_posts capability).
  • Always set the required site field in document records, falling back to the site URL when no publication record exists.

Split from #3 — this PR contains the generic extensible infrastructure. The Markpub parser implementation follows in a stacked PR.
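
The flow described in the proposed changes can be modeled roughly as follows. This is a language-neutral sketch in Python, not the plugin's PHP code: `ContentParser`, `build_document_record`, the `parse` method, the `title` field, and the example content type are hypothetical names chosen for illustration, and the two callables only stand in for the atmosphere_content_parser and atmosphere_document_content filters. It also assumes the content filter runs only when a parser produced output, which the PR description does not state.

```python
from typing import Any, Callable, Optional, Protocol


class ContentParser(Protocol):
    """Hypothetical stand-in for the Content_Parser interface."""

    def parse(self, html: str) -> dict:
        ...


def build_document_record(
    title: str,
    post_html: str,
    site_url: str,
    publication_uri: Optional[str] = None,
    # Stand-in for atmosphere_content_parser: receives the current parser
    # (None by default) and returns a parser, or None to disable parsing.
    parser_filter: Callable[[Optional[ContentParser]], Optional[ContentParser]] = lambda p: p,
    # Stand-in for atmosphere_document_content: receives the parsed output
    # and returns a possibly modified version.
    content_filter: Callable[[dict], Optional[dict]] = lambda c: c,
) -> dict:
    record: dict = {
        "$type": "site.standard.document",
        "title": title,  # illustrative field, not taken from the PR
        # `site` is always set: the publication record if one exists,
        # otherwise the site URL as a fallback.
        "site": publication_uri or site_url,
    }

    parser = parser_filter(None)  # no parser is registered by default
    if parser is not None:
        content = content_filter(parser.parse(post_html))
        if content is not None:
            record["content"] = content
    return record


class PlainTextParser:
    """Hypothetical parser that wraps the raw markup in a made-up type."""

    def parse(self, html: str) -> dict:
        return {"$type": "example.hypothetical.text", "text": html}


default_record = build_document_record("Hello", "<p>Hi</p>", "https://example.com")
parsed_record = build_document_record(
    "Hello",
    "<p>Hi</p>",
    "https://example.com",
    parser_filter=lambda _current: PlainTextParser(),
)
```

With no parser supplied, `default_record` has no `content` key and its `site` falls back to the site URL, which matches the testing instructions; `parsed_record` gains a `content` entry once a parser is returned from the parser filter.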

Other information:

  • Have you written new tests for your changes, if applicable?

Testing instructions:

  • Verify ?atproto preview on a published post returns valid JSON (when logged in with edit_posts capability).
  • Verify ?atproto is not accessible when logged out.
  • Verify the content field is absent from the record (no parser registered by default).
  • Verify the site field is always present in document records.

Changelog entry

  • Automatically create a changelog entry from the details below.
Changelog Entry Details

Message

Introduce a pluggable content parser system for the `content` union
field in site.standard.document records.

- Add Content_Parser interface for custom parser implementations.
- Wire content parsing into Document transformer with two filters:
  atmosphere_content_parser (swap/disable parser) and
  atmosphere_document_content (modify parsed output).
- Add ?atproto query param preview endpoint for inspecting the
  document record JSON (requires edit_posts capability).
- Always set the required `site` field in document records, falling
  back to the site URL when no publication record exists.
@pfefferle added the enhancement label on Mar 23, 2026
@pfefferle self-assigned this on Mar 23, 2026
@github-actions bot added the [Feature] Content Parser and [Feature] Transformer labels on Mar 23, 2026
@github-actions bot added the [Tests] Includes Tests label on Mar 23, 2026
@pfefferle merged commit 659481b into trunk on Mar 23, 2026
8 checks passed
@pfefferle deleted the add/content-parser-interface branch on March 23, 2026 17:34

Labels

  • enhancement: New feature
  • [Feature] Content Parser: Content parser for AT Protocol
  • [Feature] Transformer: AT Protocol record transformers
  • [Tests] Includes Tests: PR includes test changes
