-
Notifications
You must be signed in to change notification settings - Fork 192
[Feature] Add grammar bundle generation API for PPL language features #5162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
a7e0f6f
63916b2
2f862f3
98321e2
dd16569
55df662
7161842
7c42d67
53fdd91
d4b2c53
7cce850
c07d992
cf9ed36
c3f3c02
e070024
5854bf5
24d6f01
877669c
8f0ae44
f1226b0
1b467d2
97de071
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| { | ||
| "ppl.grammar": { | ||
| "documentation": { | ||
| "url": "https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/interfaces/endpoint.md", | ||
| "description": "PPL Grammar Endpoint for Autocomplete" | ||
| }, | ||
| "stability": "experimental", | ||
| "url": { | ||
| "paths": [ | ||
| { | ||
| "path": "/_plugins/_ppl/_grammar", | ||
| "methods": ["GET"] | ||
| } | ||
| ] | ||
| }, | ||
| "params": {} | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| --- | ||
| "PPL grammar endpoint returns expected response shape": | ||
| - do: | ||
| ppl.grammar: {} | ||
| - is_true: bundleVersion | ||
| - is_true: antlrVersion | ||
| - is_true: grammarHash | ||
| - match: {startRuleIndex: 0} | ||
| - gt: {lexerSerializedATN.0: 0} | ||
| - is_true: lexerRuleNames.0 | ||
| - is_true: channelNames.0 | ||
| - is_true: modeNames.0 | ||
| - gt: {parserSerializedATN.0: 0} | ||
| - is_true: parserRuleNames.0 | ||
| - is_true: symbolicNames.1 | ||
| - is_true: tokenDictionary.PIPE | ||
| - gt: {ignoredTokens.0: 0} | ||
| - gt: {rulesToVisit.0: 0} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,137 @@ | ||
| /* | ||
| * Copyright OpenSearch Contributors | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| */ | ||
|
|
||
| package org.opensearch.sql.plugin.rest; | ||
|
|
||
| import static org.opensearch.rest.RestRequest.Method.GET; | ||
|
|
||
| import com.google.common.annotations.VisibleForTesting; | ||
| import com.google.common.collect.ImmutableList; | ||
| import java.io.IOException; | ||
| import java.util.List; | ||
| import lombok.extern.log4j.Log4j2; | ||
| import org.opensearch.common.annotation.ExperimentalApi; | ||
| import org.opensearch.core.action.ActionListener; | ||
| import org.opensearch.core.rest.RestStatus; | ||
| import org.opensearch.core.xcontent.XContentBuilder; | ||
| import org.opensearch.rest.BaseRestHandler; | ||
| import org.opensearch.rest.BytesRestResponse; | ||
| import org.opensearch.rest.RestChannel; | ||
| import org.opensearch.rest.RestRequest; | ||
| import org.opensearch.sql.plugin.transport.PPLQueryAction; | ||
| import org.opensearch.sql.plugin.transport.TransportPPLQueryRequest; | ||
| import org.opensearch.sql.plugin.transport.TransportPPLQueryResponse; | ||
| import org.opensearch.sql.ppl.autocomplete.GrammarBundle; | ||
| import org.opensearch.sql.ppl.autocomplete.PPLGrammarBundleBuilder; | ||
| import org.opensearch.transport.client.node.NodeClient; | ||
|
|
||
| /* | ||
| * REST handler for {@code GET /_plugins/_ppl/_grammar}. | ||
| * | ||
| * @opensearch.experimental | ||
| */ | ||
| @ExperimentalApi | ||
| @Log4j2 | ||
| public class RestPPLGrammarAction extends BaseRestHandler { | ||
|
|
||
| private static final String ENDPOINT_PATH = "/_plugins/_ppl/_grammar"; | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We need to update endpoint documentation?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. Added endpoint docs and stabilized doctest example output. |
||
|
|
||
| @Override | ||
| public String getName() { | ||
| return "ppl_grammar_action"; | ||
| } | ||
|
|
||
| @Override | ||
| public List<Route> routes() { | ||
| return ImmutableList.of(new Route(GET, ENDPOINT_PATH)); | ||
| } | ||
|
|
||
| @Override | ||
| protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) | ||
| throws IOException { | ||
|
|
||
| return channel -> { | ||
| try { | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This endpoint bypasses the transport action layer that other PPL handlers use (e.g.,
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point. Fixed in this PR: /_plugins/_ppl/_grammar now performs an explicit transport-action authorization check via PPLQueryAction before serving the bundle, so access control is no longer only route-prefix-based. I also added security tests for grammar endpoint access with and without PPL permission. |
||
| authorizeRequest( | ||
| client, | ||
| new ActionListener<>() { | ||
| @Override | ||
| public void onResponse(TransportPPLQueryResponse ignored) { | ||
| try { | ||
| GrammarBundle bundle = getBundle(); | ||
| XContentBuilder builder = channel.newBuilder(); | ||
| serializeBundle(builder, bundle); | ||
| channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder)); | ||
| } catch (Exception e) { | ||
| log.error("Error building or serializing PPL grammar", e); | ||
| sendErrorResponse(channel, e); | ||
| } | ||
| } | ||
|
|
||
| @Override | ||
| public void onFailure(Exception e) { | ||
| log.error("PPL grammar authorization failed", e); | ||
| sendErrorResponse(channel, e); | ||
| } | ||
| }); | ||
| } catch (Exception e) { | ||
| log.error("Error authorizing PPL grammar request", e); | ||
| sendErrorResponse(channel, e); | ||
| } | ||
| }; | ||
| } | ||
|
|
||
| @VisibleForTesting | ||
| protected void authorizeRequest( | ||
| NodeClient client, ActionListener<TransportPPLQueryResponse> listener) { | ||
| client.execute( | ||
| PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("", null, ENDPOINT_PATH), listener); | ||
| } | ||
|
|
||
| private void sendErrorResponse(RestChannel channel, Exception e) { | ||
| try { | ||
| channel.sendResponse(new BytesRestResponse(channel, e)); | ||
| } catch (IOException ioException) { | ||
| log.error("Failed to send PPL grammar error response", ioException); | ||
| } | ||
| } | ||
|
|
||
| /** Gets the grammar bundle. Override in tests to inject a custom or failing bundle provider. */ | ||
| @VisibleForTesting | ||
| protected GrammarBundle getBundle() { | ||
| return PPLGrammarBundleBuilder.getBundle(); | ||
| } | ||
|
|
||
| private void serializeBundle(XContentBuilder builder, GrammarBundle bundle) throws IOException { | ||
| builder.startObject(); | ||
|
|
||
| // Identity & versioning | ||
| builder.field("bundleVersion", bundle.getBundleVersion()); | ||
| builder.field("antlrVersion", bundle.getAntlrVersion()); | ||
| builder.field("grammarHash", bundle.getGrammarHash()); | ||
| builder.field("startRuleIndex", bundle.getStartRuleIndex()); | ||
|
|
||
| // Lexer ATN & metadata | ||
| builder.field("lexerSerializedATN", bundle.getLexerSerializedATN()); | ||
| builder.field("lexerRuleNames", bundle.getLexerRuleNames()); | ||
| builder.field("channelNames", bundle.getChannelNames()); | ||
| builder.field("modeNames", bundle.getModeNames()); | ||
|
|
||
| // Parser ATN & metadata | ||
| builder.field("parserSerializedATN", bundle.getParserSerializedATN()); | ||
| builder.field("parserRuleNames", bundle.getParserRuleNames()); | ||
|
|
||
| // Vocabulary | ||
| builder.field("literalNames", bundle.getLiteralNames()); | ||
| builder.field("symbolicNames", bundle.getSymbolicNames()); | ||
|
|
||
| // Autocomplete configuration | ||
| builder.field("tokenDictionary", bundle.getTokenDictionary()); | ||
| builder.field("ignoredTokens", bundle.getIgnoredTokens()); | ||
| builder.field("rulesToVisit", bundle.getRulesToVisit()); | ||
|
|
||
| builder.endObject(); | ||
|
Comment on lines
+110
to
+135
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. thought: I dislike that we're going to be directly exposing ANTLR implementation details like this. It means any frontends wanting to build their own autocomplete (CLIs, CWL Console, anything else other than OSD) will need to introduce ANTLR dependencies. There's also just generally a lot of validation and conversion happening between AST/RelNode that lives outside of the antlr directly, so frontends still need to be pretty highly aware of how the backend works and per-command semantics. Did we already research & eliminate the option of making a custom format that encodes more useful information? Guessing there isn't a better way or format without some heavy lifting :/ |
||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -119,7 +119,16 @@ public String getRequest() { | |||
| * @return true if it is an explain request | ||||
| */ | ||||
| public boolean isExplainRequest() { | ||||
| return path.endsWith("/_explain"); | ||||
| return path != null && path.endsWith("/_explain"); | ||||
| } | ||||
|
|
||||
| /** | ||||
| * Check if request is for grammar metadata endpoint. | ||||
| * | ||||
| * @return true if it is a grammar metadata request | ||||
| */ | ||||
| public boolean isGrammarRequest() { | ||||
| return path != null && path.endsWith("/_grammar"); | ||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. path.endsWith("/_grammar") would match a hypothetical future endpoint like
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We've done the same thing historically:
It's a bit sloppy, but it works for the most part lol. Should maybe revisit using proper path routing some time, I'd like that as it'd make it easier to navigate how our code responds to requests. |
||||
| } | ||||
|
|
||||
| /** Decide on the formatter by the requested format. */ | ||||
|
|
||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No integration test covers this new endpoint. Without an integ-test or yamlRestTest, there's no verification that the endpoint works when the plugin is loaded in a real OpenSearch node (route registration, serialization over the wire, security plugin interaction). Add at least a basic integ-test that hits the endpoint and validates the response shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Added YAML REST API spec + integration test for /_plugins/_ppl/_grammar response shape.