Is your feature request related to a problem?
Currently, PPL autocomplete in OpenSearch Dashboards relies on a frontend-maintained copy of the PPL grammar. Because the PPL grammar evolves frequently, this leads to:
- Version drift: OpenSearch Dashboards can fall out of sync with the backend grammar, producing incorrect or stale autocomplete suggestions.
- No single source of truth: There is no authoritative, runtime-accessible source for the PPL grammar that downstream consumers can rely on.
What solution would you like?
Introduce a new REST API (GET /_plugins/_ppl/_grammar) that serves a grammar bundle — a JSON payload containing all ANTLR grammar metadata needed to reconstruct a functional PPL lexer and parser on the OpenSearch Dashboards side at runtime. The bundle includes:
- Serialized 16-bit integer ATN arrays (lexer and parser)
- Token vocabularies (literalNames, symbolicNames)
- Rule names, channel names, and mode names
- A bundleVersion field and a deterministic grammarHash (SHA-256)
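To make the payload concrete, here is a minimal TypeScript sketch of what the bundle might look like on the Dashboards side. The field names are the ones used in this proposal. The exact types, the nullable vocabulary entries, and the `isGrammarBundle` helper are assumptions for illustration, not a finalized contract.

```typescript
// Hypothetical shape of the grammar bundle. Field names follow this proposal;
// the types and optionality are assumptions.
interface GrammarBundle {
  bundleVersion: string;
  grammarHash: string;
  startRuleIndex: number;
  lexerSerializedATN: number[];
  parserSerializedATN: number[];
  lexerRuleNames: string[];
  parserRuleNames: string[];
  channelNames: string[];
  modeNames: string[];
  // Assumed nullable: ANTLR vocabularies leave gaps for tokens without a name.
  literalNames: (string | null)[];
  symbolicNames: (string | null)[];
}

const isString = (v: unknown): boolean => typeof v === "string";
const isNullableString = (v: unknown): boolean => v === null || typeof v === "string";
const isInteger = (v: unknown): boolean => typeof v === "number" && Number.isInteger(v);

// True when value is an array and every element passes the given check.
const isArrayOf = (value: unknown, check: (item: unknown) => boolean): boolean =>
  Array.isArray(value) && value.every((item) => check(item));

// Structural check to run before handing the payload to any ANTLR runtime.
function isGrammarBundle(value: unknown): value is GrammarBundle {
  if (typeof value !== "object" || value === null) return false;
  const b = value as Record<string, unknown>;
  return (
    isString(b.bundleVersion) &&
    isString(b.grammarHash) &&
    isInteger(b.startRuleIndex) &&
    isArrayOf(b.lexerSerializedATN, isInteger) &&
    isArrayOf(b.parserSerializedATN, isInteger) &&
    isArrayOf(b.lexerRuleNames, isString) &&
    isArrayOf(b.parserRuleNames, isString) &&
    isArrayOf(b.channelNames, isString) &&
    isArrayOf(b.modeNames, isString) &&
    isArrayOf(b.literalNames, isNullableString) &&
    isArrayOf(b.symbolicNames, isNullableString)
  );
}
```

A client could call `isGrammarBundle(await response.json())` and fall back to its bundled grammar when the check fails.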
What alternatives have you considered?
- Serve raw .g4 grammar files for the frontend to compile
- Serve the entire parser from the PPL backend
- Serve the entire autocomplete service from the PPL backend
Do you have any additional context?
How grammar changes are reflected in the bundle:
When the PPL grammar (.g4 files) is modified and the plugin is rebuilt, the changes propagate through the bundle as follows:
- ATN arrays (lexerSerializedATN, parserSerializedATN): ANTLR regenerates the lexer and parser classes at build time, producing new ATN state machines. The serialized integer arrays will differ to reflect new/modified/removed states and transitions.
- Rule names (lexerRuleNames, parserRuleNames): Any added, removed, or renamed lexer/parser rules are reflected in declaration order. The startRuleIndex may shift if rules are added before the entry rule.
- Token vocabulary (literalNames, symbolicNames): New keywords, operators, or token types appear in the vocabulary arrays. Removed tokens are no longer present. Changes to token literal text (e.g., renaming a keyword) update literalNames.
- Channel and mode names (channelNames, modeNames): Updated only if the grammar adds or modifies lexer channels or modes (uncommon for typical grammar changes).
- grammarHash: The SHA-256 is computed from the ATN data and ANTLR version, so any grammar change — no matter how small — produces a new hash. Clients can compare this hash against their cached value to detect whether the bundle has changed.
In all cases, these changes only take effect when the plugin JAR is rebuilt and deployed. The bundle is immutable for the lifetime of a given plugin build. On node restart with a new plugin version, the first request triggers a fresh bundle build with the updated grammar, and the new bundle is cached for the node's lifetime.
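The hash comparison described above could look roughly like the following on the client. This is a sketch only: `GrammarBundleCache` and `HashedBundle` are hypothetical names, and the cache keeps a single bundle in memory.

```typescript
// Minimal view of the bundle needed for change detection; other fields omitted.
interface HashedBundle {
  grammarHash: string;
}

// Hypothetical client-side cache keyed on grammarHash.
class GrammarBundleCache<T extends HashedBundle> {
  private current: T | undefined;

  // Stores the incoming bundle and returns true if its hash differs from the
  // cached one (or nothing is cached yet); otherwise keeps the cached bundle.
  update(incoming: T): boolean {
    if (this.current !== undefined && this.current.grammarHash === incoming.grammarHash) {
      return false;
    }
    this.current = incoming;
    return true;
  }

  get(): T | undefined {
    return this.current;
  }
}
```

When `update` returns `true`, the client would rebuild its lexer and parser from the new bundle. When it returns `false`, the existing instances can be reused.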
Grammar API schema management:
The bundleVersion field in the response body governs the schema of the grammar bundle itself (not the grammar content). This is how we manage the API contract moving forward:
- Non-breaking changes (e.g., adding a new optional field): The bundleVersion remains unchanged. Clients should tolerate unknown fields, so new optional fields can be added without requiring client updates.
- Breaking changes (e.g., removing a field, changing a field's type or semantics): The bundleVersion is incremented (e.g., "1.0" → "2.0"). Clients check bundleVersion before deserializing and can reject or fall back if they encounter an unsupported version.
- No URL path versioning: Following OpenSearch plugin conventions, versioning is handled via the response body field rather than the URL path. This keeps the endpoint stable and gives the backend full control over schema evolution.
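A client-side version gate following these rules might be sketched as below. It assumes that a breaking change is signalled by the leading (major) component of bundleVersion, as in the "1.0" → "2.0" example. `SUPPORTED_MAJOR`, `parseMajor`, and `isSupportedBundleVersion` are hypothetical names.

```typescript
// Hypothetical: the only schema major version this client understands.
const SUPPORTED_MAJOR = 1;

// Extracts the leading numeric component of a dotted version such as "1.0".
// Returns null when the string is not a dotted numeric version.
function parseMajor(bundleVersion: string): number | null {
  const match = /^(\d+)(?:\.\d+)*$/.exec(bundleVersion);
  return match ? Number(match[1]) : null;
}

// Checks bundleVersion before any further deserialization. Unknown extra
// fields are ignored, so non-breaking additions do not cause a rejection.
function isSupportedBundleVersion(raw: unknown): boolean {
  if (typeof raw !== "object" || raw === null) return false;
  const version = (raw as Record<string, unknown>).bundleVersion;
  if (typeof version !== "string") return false;
  return parseMajor(version) === SUPPORTED_MAJOR;
}
```

If the check fails, the client can fall back to its built-in grammar rather than attempt to interpret a schema it does not know.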