feat(docs): add llms.txt page #3629

Merged
EtienneM merged 1 commit into master from feat/llms.txt on May 5, 2026

@benjaminach benjaminach commented Mar 26, 2026

Summary

This PR introduces a generated llms.txt feed for doc.scalingo.com and improves LLM routing quality by adding optional, high-signal page descriptions.

Changes

  • Added src/llms.txt (Jekyll-generated) to expose a machine-readable list of documentation pages.
  • Updated _config.yml to include llms.txt in the generated site.
  • Filtered out non-target content from the feed (e.g. changelog entries, directory/index-like pages, explicitly excluded pages).
  • Added a new Codex skill: .codex/skills/scalingo-generate-description/SKILL.md.
  • Added its OpenAI agent config: .codex/skills/scalingo-generate-description/agents/openai.yaml.
  • Standardized description formatting with double-quoted YAML values.
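
The filtering described in the list above can be sketched roughly as follows. This is an illustration in Python, not the Jekyll template shipped in src/llms.txt. The page fields (`url`, `title`, `description`), the `/changelog/` prefix check, the trailing-slash test for index-like pages, and the output line format are all assumptions made for the example; only the `page_exclude` flag appears in this PR.

```python
from typing import Optional, TypedDict


class Page(TypedDict, total=False):
    # Hypothetical page record; field names are assumptions for this sketch.
    url: str
    title: str
    description: Optional[str]
    page_exclude: bool


def is_listed(page: Page) -> bool:
    """Return True if the page should appear in the feed (assumed rules)."""
    url = page.get("url", "")
    if page.get("page_exclude"):  # explicitly excluded page
        return False
    if url.startswith("/changelog/"):  # changelog entry (assumed URL prefix)
        return False
    if url.endswith("/"):  # directory/index-like page (assumed convention)
        return False
    return True


def render_entry(page: Page, base_url: str) -> str:
    """Render one link line, appending the description only when it is set."""
    line = f"- [{page['title']}]({base_url}{page['url']})"
    description = page.get("description")
    if description:
        line += f": {description}"
    return line


def render_feed(pages: list[Page], base_url: str) -> str:
    """Join the entries of all pages that pass the filter."""
    return "\n".join(render_entry(p, base_url) for p in pages if is_listed(p))
```

Because the description is optional, a page without one still produces a valid entry; it simply carries less information for an LLM scanning the feed.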

Why

description fields are optional, but they significantly help LLMs understand page intent and route queries to the right documentation page when scanning llms.txt.

Impact

  • New machine-readable endpoint: /llms.txt
  • Better semantic retrieval/routing for doc-focused LLM workflows
  • No expected user-facing behavior changes in normal site navigation

Validation

  • Verified Jekyll generation includes llms.txt.
  • Checked feed formatting and URL output.
  • Checked YAML front matter validity on updated pages.
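
The PR does not say how the front matter check was performed. One possible way to automate the double-quoted `description` convention is sketched below, using only the Python standard library. It is a line-based check, not a YAML parser, and the function name and messages are hypothetical.

```python
import re

# Front matter is the block between the leading "---" line and the next "---" line.
FRONT_MATTER = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
# A single-line description whose value is wrapped in double quotes
# (backslash escapes inside the quotes are allowed).
QUOTED_DESCRIPTION = re.compile(r'^description: "(?:[^"\\]|\\.)*"\s*$')


def description_problems(text: str) -> list[str]:
    """List problems with the optional description field of one page."""
    match = FRONT_MATTER.match(text)
    if match is None:
        return ["missing front matter"]
    problems = []
    for line in match.group(1).splitlines():
        if line.startswith("description:") and not QUOTED_DESCRIPTION.match(line):
            problems.append("description is not a double-quoted value")
    return problems
```

A page with no `description` key passes, which matches the field being optional. Multi-line YAML values are out of scope for this sketch and would be reported as problems.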

@benjaminach benjaminach marked this pull request as draft March 27, 2026 11:03
@benjaminach benjaminach changed the title from "add files to generates an llms.txt file" to "feat(docs): add llms.txt feed and optional description generation for better LLM routing" Mar 27, 2026
Comment thread src/llms.txt Outdated
Comment thread src/llms.txt Outdated
<script defer data-domain="scalingo.com" event-app="documentation" data-api="https://scalingo.com/sc-analytics/event" src="{{ 'assets/analytics.js' | esbuild_asset_path }}"></script>
<link rel="stylesheet" href="https://use.typekit.net/ajl3atf.css">
{% if jekyll.environment == "production" %}
<script async src="https://analytics.scalingo.com/script.js"></script>

@semgrep-code-scalingo semgrep-code-scalingo Bot Apr 16, 2026

This tag is missing an 'integrity' subresource integrity attribute. The 'integrity' attribute allows for the browser to verify that externally hosted files (for example from a CDN) are delivered without unexpected manipulation. Without this attribute, if an attacker can modify the externally hosted resource, this could lead to XSS and other types of attacks. To prevent this, include the base64-encoded cryptographic hash of the resource (file) you’re telling the browser to fetch in the 'integrity' attribute for all externally hosted files.

🧁 Removed in commit faa8a24 🧁
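
For reference, the value the finding above asks for can be computed as in this minimal sketch. The function name is hypothetical, and SHA-384 is chosen here only because it is a commonly used SRI algorithm; the flagged tag was removed from this PR rather than given an `integrity` attribute.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI value: the algorithm name, a dash, then the base64 digest."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"
```

The result goes into the tag as `integrity="sha384-..."`, usually together with `crossorigin="anonymous"`. Pinning a hash only works for files whose content is stable: if the hosted script changes, the browser refuses to run it until the hash is updated.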

@Frzk Frzk force-pushed the feat/llms.txt branch 2 times, most recently from 5213261 to faa8a24 Compare April 16, 2026 13:14
@semgrep-code-scalingo

Legal Risk

The following dependencies were released under a license that
has been flagged by your organization for consideration.

Recommendation

While merging is not directly blocked, it's best to pause and consider what it means to use this license before continuing. If you are unsure, reach out to your security team or Semgrep admin to address this issue.

MPL-2.0

Comment thread src/changelog/feed.xml
@@ -1,6 +1,5 @@
---
layout: null
page_exclude: true
Member

question: what does this change, what is the impact?

Contributor Author

Hmm, I thought `page_exclude: true` was safe to remove, but now I am not sure, since these pages are now included in https://documentation-service-pr3629.osc-fr1.scalingo.io/sitemap.xml. I will check.

@EtienneM EtienneM changed the title feat(docs): add llms.txt feed and optional description generation for better LLM routing feat(docs): add llms.txt page May 5, 2026
@EtienneM EtienneM merged commit a14015d into master May 5, 2026
4 checks passed
@EtienneM EtienneM deleted the feat/llms.txt branch May 5, 2026 14:45