rust-lang · traviscross · Oct 9, 2025 · Feb 4, 2026 · Feb 25, 2026 · Feb 25, 2026
diff --git a/src/SUMMARY.md b/src/SUMMARY.md
@@ -7,6 +7,7 @@
 - [Lexical structure](lexical-structure.md)
     - [Input format](input-format.md)
     - [Shebang](shebang.md)
+    - [Frontmatter](frontmatter.md)
     - [Keywords](keywords.md)
     - [Identifiers](identifiers.md)
     - [Comments](comments.md)

diff --git a/src/frontmatter.md b/src/frontmatter.md
@@ -0,0 +1,67 @@
+r[frontmatter]
+# Frontmatter
+
+r[frontmatter.intro]
+Frontmatter is an optional section of metadata whose syntax allows external tools to read it without parsing Rust.
+
+> [!EXAMPLE]
+> <!-- ignore: test runner doesn't support frontmatter -->
+> ```rust,ignore
+> #!/usr/bin/env cargo
+> --- cargo
+> package.edition = "2024"
+> ---
+>
+> fn main() {}
+> ```
+
+r[frontmatter.syntax]
+```grammar,lexer
+@root FRONTMATTER ->
+    WHITESPACE_ONLY_LINE*
+    !FRONTMATTER_INVALID
+    FRONTMATTER_MAIN
+
+WHITESPACE_ONLY_LINE -> (!LF WHITESPACE)* LF
+
+FRONTMATTER_INVALID -> (!LF WHITESPACE)+ `---` ^ ⊥
+
+FRONTMATTER_MAIN ->
+    `-`{n:3..=255} ^ FRONTMATTER_REST
+
+FRONTMATTER_REST ->
+    FRONTMATTER_FENCE_START
+    FRONTMATTER_LINE*
+    FRONTMATTER_FENCE_END
+
+FRONTMATTER_FENCE_START ->
+    MAYBE_INFOSTRING_OR_WS LF
+
+FRONTMATTER_FENCE_END ->
+    `-`{n} HORIZONTAL_WHITESPACE* ( LF | EOF )
+
+FRONTMATTER_LINE -> !`-`{n} ~[LF CR]* LF
+
+MAYBE_INFOSTRING_OR_WS ->
+    HORIZONTAL_WHITESPACE* INFOSTRING? HORIZONTAL_WHITESPACE*
+
+INFOSTRING -> (XID_Start | `_`) ( XID_Continue | `-` | `.` )*
+```
+
+r[frontmatter.position]
+Frontmatter may appear at the start of the file (after the optional [byte order mark]) or after a [shebang]. In either case, it may be preceded by [whitespace].
+
+r[frontmatter.fence]
+Frontmatter must start and end with a *fence*. Each fence must start at the beginning of a line. The opening fence must consist of at least 3 and no more than 255 hyphens (`-`). The closing fence must have exactly the same number of hyphens as the opening fence. The hyphens of either fence may be followed by [horizontal whitespace].
+
+r[frontmatter.infostring]
+The hyphens of the opening fence may be followed, after optional [horizontal whitespace], by an infostring that identifies the format or purpose of the body. An infostring may be followed by horizontal whitespace.
+
+r[frontmatter.body]
+No line in the body may start with a sequence of hyphens (`-`) equal to or longer than the opening fence. The body may not contain any carriage returns (that survive [CRLF normalization]).
+
+[byte order mark]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
+[CRLF normalization]: input.crlf
+[horizontal whitespace]: grammar-HORIZONTAL_WHITESPACE
+[shebang]: input-format.md#shebang-removal
+[whitespace]: whitespace.md
diff --git a/src/input-format.md b/src/input-format.md
@@ -44,6 +44,25 @@ r[input.shebang]
 r[input.shebang.removal]
 If a [shebang] is present, it is removed from the input sequence (and is therefore ignored).
 
+r[input.frontmatter]
+## Frontmatter removal
+
+r[input.frontmatter.removal]
+If the remaining input begins with a [frontmatter] fence, optionally preceded by lines containing only [whitespace], the [frontmatter] and any preceding whitespace are removed.
+
+For example, given the following file:
+
+<!-- ignore: test runner doesn't support frontmatter -->
+```rust,ignore
+--- cargo
+package.edition = "2024"
+---
+
+fn main() {}
+```
+
+The first three lines (the opening fence, body, and closing fence) would be removed, leaving an empty line followed by `fn main() {}`.
+
 r[input.tokenization]
 ## Tokenization
 
@@ -54,11 +73,12 @@ The resulting sequence of characters is then converted into tokens as described
 >
 > - Byte order mark removal.
 > - CRLF normalization.
-> - Shebang removal when invoked in an item context (as opposed to expression or statement contexts).
+> - Shebang and frontmatter removal when invoked in an item context (as opposed to expression or statement contexts).
 >
 > The [`include_str!`] and [`include_bytes!`] macros do not apply these transformations.
 
 [BYTE ORDER MARK]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
 [Crates and source files]: crates-and-source-files.md
+[frontmatter]: frontmatter.md
 [shebang]: shebang.md
 [whitespace]: whitespace.md
diff --git a/src/items/modules.md b/src/items/modules.md
@@ -123,7 +123,7 @@ r[items.mod.attributes]
 ## Attributes on modules
 
 r[items.mod.attributes.intro]
-Modules, like all items, accept outer attributes. They also accept inner attributes: either after `{` for a module with a body, or at the beginning of the source file, after the optional BOM and shebang.
+Modules, like all items, accept outer attributes. They also accept inner attributes: either after `{` for a module with a body, or at the beginning of the source file, after the optional BOM, shebang, and frontmatter.
 
 r[items.mod.attributes.supported]
 The built-in attributes that have meaning on a module are [`cfg`], [`deprecated`], [`doc`], [the lint check attributes], [`path`], and [`no_implicit_prelude`]. Modules also accept macro attributes.

diff --git a/src/notation.md b/src/notation.md
@@ -45,6 +45,18 @@ Mizushima et al. introduced [cut operators][cut operator paper] to parsing expre
 
 The hard cut operator is necessary because some tokens in Rust begin with a prefix that is itself a valid token. For example, `c"` begins a C string literal, but `c` alone is a valid identifier. Without the cut, if `c"\0"` failed to lex as a C string literal (because null bytes are not allowed in C strings), the parser could backtrack and lex it as two tokens: the identifier `c` and the string literal `"\0"`. The [cut after `c"`] prevents this --- once the opening delimiter is recognized, the parser cannot go back. The same reasoning applies to [byte literals], [byte string literals], [raw string literals], and other literals with prefixes that are themselves valid tokens.
 
+r[notation.grammar.bottom]
+### The bottom rule
+
+In logic, ⊥ (*bottom*) represents *absurdity* --- a proposition that is always false. In type theory, it is the *empty type* --- a type with no inhabitants. The grammar borrows both senses: the rule ⊥ matches nothing --- not any character, not even the end of input.
+
+```grammar,notation
+// The bottom rule does not match anything.
+⊥ -> !(CHAR | EOF)
+```
+
+Placed after a [hard cut operator], ⊥ makes a rule fail unconditionally once the parser has committed past the cut. This gives the grammar a way to express recognition without acceptance. The parser identifies the input, commits so that no other alternative can be tried, and then rejects it. In the frontmatter grammar, for example, [FRONTMATTER_INVALID] uses `^ ⊥` to recognize an opening fence preceded by whitespace on the same line --- input that is close enough to frontmatter to rule out other interpretations but is not valid.
+
 r[notation.grammar.string-tables]
 ### String table productions
 

diff --git a/src/whitespace.md b/src/whitespace.md
@@ -16,6 +16,10 @@ WHITESPACE ->
     | U+2028 // Line separator
     | U+2029 // Paragraph separator
 
+HORIZONTAL_WHITESPACE ->
+      U+0009 // Horizontal tab, `'\t'`
+    | U+0020 // Space, `' '`
+
 TAB -> U+0009 // Horizontal tab, `'\t'`
 
 LF -> U+000A  // Line feed, `'\n'`
@@ -26,10 +30,14 @@ CR -> U+000D  // Carriage return, `'\r'`
 r[lex.whitespace.intro]
 Whitespace is any non-empty string containing only characters that have the [`Pattern_White_Space`] Unicode property.
 
+r[lex.whitespace.horizontal]
+[HORIZONTAL_WHITESPACE] is the horizontal space subset of [`Pattern_White_Space`] as categorized by [UAX #31, Section 4.1][uax31-4.1].
+
 r[lex.whitespace.token-sep]
 Rust is a "free-form" language, meaning that all forms of whitespace serve only to separate _tokens_ in the grammar, and have no semantic significance.
 
 r[lex.whitespace.replacement]
 A Rust program has identical meaning if each whitespace element is replaced with any other legal whitespace element, such as a single space character.
 
 [`Pattern_White_Space`]: https://www.unicode.org/reports/tr31/
+[uax31-4.1]: https://www.unicode.org/reports/tr31/#Whitespace_and_Syntax
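
Review note: to check that the prose rules in `frontmatter.md` agree with the grammar, here is a minimal sketch of a frontmatter splitter. It is not the compiler's implementation. The names `split_frontmatter`, `Frontmatter`, and `FrontmatterError` are hypothetical, the input is assumed to have already gone through byte order mark removal, CRLF normalization, and shebang removal, and `XID_Start` / `XID_Continue` are approximated with ASCII letters and digits to avoid an external crate.

```rust
/// Why a frontmatter-like prefix was rejected (hypothetical names).
#[derive(Debug, PartialEq)]
enum FrontmatterError {
    IndentedFence,        // FRONTMATTER_INVALID: whitespace before `---`
    BadOpeningFence,      // more than 255 hyphens, or a malformed infostring
    BadClosingFence,      // a line starts with the fence but has trailing junk
    CarriageReturnInBody, // a CR that survived CRLF normalization
    Unclosed,             // input ended before a closing fence
}

/// The pieces of a recognized frontmatter (hypothetical type).
#[derive(Debug, PartialEq)]
struct Frontmatter<'a> {
    infostring: Option<&'a str>,
    body: &'a str,
    rest: &'a str, // what remains after frontmatter removal
}

/// Characters with the `Pattern_White_Space` property.
fn is_pattern_white_space(c: char) -> bool {
    matches!(c, '\u{0009}'..='\u{000D}' | '\u{0020}' | '\u{0085}'
        | '\u{200E}' | '\u{200F}' | '\u{2028}' | '\u{2029}')
}

/// HORIZONTAL_WHITESPACE: tab or space.
fn is_horizontal_ws(c: char) -> bool {
    c == '\t' || c == ' '
}

fn split_frontmatter(mut src: &str) -> Result<Option<Frontmatter<'_>>, FrontmatterError> {
    // Skip WHITESPACE_ONLY_LINE*, rejecting a fence indented on its own line.
    loop {
        let after_ws = src.trim_start_matches(|c: char| c != '\n' && is_pattern_white_space(c));
        if let Some(next) = after_ws.strip_prefix('\n') {
            src = next;
        } else if after_ws.len() < src.len() && after_ws.starts_with("---") {
            return Err(FrontmatterError::IndentedFence);
        } else {
            break;
        }
    }

    // Opening fence: 3 to 255 hyphens. Fewer than 3 means "not frontmatter".
    let n = src.chars().take_while(|&c| c == '-').count();
    if n < 3 {
        return Ok(None);
    }
    if n > 255 {
        return Err(FrontmatterError::BadOpeningFence);
    }
    let fence = &src[..n];
    let (open_line, body_start) = src[n..].split_once('\n').ok_or(FrontmatterError::Unclosed)?;

    // Optional infostring, surrounded by optional horizontal whitespace.
    // ASSUMPTION: ASCII-only stand-in for XID_Start / XID_Continue.
    let info = open_line.trim_matches(is_horizontal_ws);
    let infostring = if info.is_empty() {
        None
    } else {
        let mut chars = info.chars();
        let first_ok = chars.next().is_some_and(|c| c.is_ascii_alphabetic() || c == '_');
        let rest_ok = chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.'));
        if !(first_ok && rest_ok) {
            return Err(FrontmatterError::BadOpeningFence);
        }
        Some(info)
    };

    // Body lines until a line that starts with the same number of hyphens.
    let mut tail = body_start;
    let mut body_len = 0;
    loop {
        if tail.starts_with(fence) {
            let after = &tail[n..];
            let (line, rest) = after.split_once('\n').unwrap_or((after, ""));
            if line.chars().all(is_horizontal_ws) {
                let body = &body_start[..body_len];
                return Ok(Some(Frontmatter { infostring, body, rest }));
            }
            // Extra hyphens or other text after the fence: neither a body
            // line nor a valid closing fence.
            return Err(FrontmatterError::BadClosingFence);
        }
        let Some((line, next)) = tail.split_once('\n') else {
            return Err(FrontmatterError::Unclosed);
        };
        if line.contains('\r') {
            return Err(FrontmatterError::CarriageReturnInBody);
        }
        body_len += line.len() + 1; // include the line's LF
        tail = next;
    }
}
```

Under these assumptions, calling `split_frontmatter` on the example file from `input-format.md` yields the infostring `cargo`, a one-line body, and a `rest` that begins with the empty line preceding `fn main() {}`, which matches the removal behavior described in `input.frontmatter.removal`. Returning `Ok(None)` for input with fewer than three leading hyphens mirrors the grammar, where the cut (`^`) only commits once a full opening fence has been seen.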