Add validation infrastructure and type system #5213
yuancu wants to merge 1 commit into opensearch-project:feature/validation
This PR adds the foundation for PPL operand type validation:

- Add OperandTypeChecker interface and implementations (fixed-arity, variadic, composite, etc.)
- Add TypeFamily enum for categorizing SQL/PPL types
- Add ValidationRule and ValidationContext for the validation pipeline
- Add ValidatingRelNodeVisitor for walking Calcite rel trees
- Wire validation infrastructure into CalcitePPLAbstractModule
- Update expected output files for explain tests
- None of the validation logic is enabled yet; it will be turned on in a subsequent PR

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
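For reviewers who want a concrete picture of the fixed-arity, variadic, and composite checkers named in the commit message, here is a minimal standalone sketch. It is not the code from this PR: the method names (`check`, `fixed`, `variadic`, `anyOf`, `matches`), the `TypeFamily` constants, and the demo class are all assumptions made for illustration, and the real interface may look quite different.

```java
import java.util.Arrays;
import java.util.List;

class OperandCheckDemo {
    public static void main(String[] args) {
        // Hypothetical signature set for a function taking (STRING, NUMERIC[, NUMERIC]).
        OperandTypeChecker checker = OperandTypeChecker.anyOf(
            OperandTypeChecker.fixed(TypeFamily.STRING, TypeFamily.NUMERIC),
            OperandTypeChecker.fixed(TypeFamily.STRING, TypeFamily.NUMERIC, TypeFamily.NUMERIC));
        System.out.println(checker.check(Arrays.asList(TypeFamily.STRING, TypeFamily.NUMERIC)));
        System.out.println(checker.check(Arrays.asList(TypeFamily.NUMERIC, TypeFamily.NUMERIC)));
    }
}

/** Coarse type categories. These constants are illustrative, not the PR's actual enum values. */
enum TypeFamily { NUMERIC, STRING, BOOLEAN, DATETIME, ANY }

/** Assumed shape of a checker: accepts or rejects a list of operand type families. */
interface OperandTypeChecker {
    boolean check(List<TypeFamily> operands);

    /** Fixed arity: the operand count and every position must match. */
    static OperandTypeChecker fixed(TypeFamily... expected) {
        List<TypeFamily> exp = Arrays.asList(expected);
        return operands -> {
            if (operands.size() != exp.size()) {
                return false;
            }
            for (int i = 0; i < exp.size(); i++) {
                if (!matches(exp.get(i), operands.get(i))) {
                    return false;
                }
            }
            return true;
        };
    }

    /** Variadic: at least minCount operands, each matching the given family. */
    static OperandTypeChecker variadic(TypeFamily family, int minCount) {
        return operands -> operands.size() >= minCount
            && operands.stream().allMatch(op -> matches(family, op));
    }

    /** Composite: passes if any alternative passes. */
    static OperandTypeChecker anyOf(OperandTypeChecker... alternatives) {
        return operands -> Arrays.stream(alternatives).anyMatch(c -> c.check(operands));
    }

    /** ANY is treated as a wildcard on the expected side. */
    static boolean matches(TypeFamily expected, TypeFamily actual) {
        return expected == TypeFamily.ANY || expected == actual;
    }
}
```

Composing small checkers this way keeps each function's signature declarative, which is one plausible reason to split the implementations by arity.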
Summary
Sub-PR 1/4 for #4892, splitting the large PR into reviewable pieces targeting `feature/validation`.

This PR adds all validation infrastructure and type system code without enabling validation in the execution pipeline. All existing tests pass unchanged.
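To make the intended flow easier to follow, here is a rough sketch of how a RelNode to SqlNode round trip with a skip check could be arranged. The stage order (skip check, rel-to-SQL, rewrite, validate, SQL-to-rel) is an assumption inferred from the component descriptions in this PR, not taken from the code; `ValidationRoundTrip` and its fields are hypothetical, and plain strings stand in for Calcite's `RelNode` and `SqlNode`.

```java
import java.util.function.Function;
import java.util.function.Predicate;

class ValidationRoundTripDemo {
    public static void main(String[] args) {
        ValidationRoundTrip<String, String> roundTrip = new ValidationRoundTrip<String, String>(
            plan -> plan.contains("LogicalValues"),           // stand-in for a skip-pattern detector
            plan -> "SELECT * FROM db.t /* " + plan + " */",  // stand-in for rel -> SQL conversion
            sql -> sql.replace("db.", ""),                    // stand-in for stripping database qualifiers
            sql -> sql,                                       // stand-in for the validator (no-op here)
            sql -> "Rel(" + sql + ")");                       // stand-in for SQL -> rel conversion
        System.out.println(roundTrip.validate("LogicalValues"));
        System.out.println(roundTrip.validate("LogicalProject"));
    }
}

/** Hypothetical pipeline: R plays the role of a relational plan, S of a SQL tree. */
final class ValidationRoundTrip<R, S> {
    private final Predicate<R> shouldSkip;
    private final Function<R, S> relToSql;
    private final Function<S, S> rewrite;
    private final Function<S, S> validator;
    private final Function<S, R> sqlToRel;

    ValidationRoundTrip(Predicate<R> shouldSkip, Function<R, S> relToSql, Function<S, S> rewrite,
                        Function<S, S> validator, Function<S, R> sqlToRel) {
        this.shouldSkip = shouldSkip;
        this.relToSql = relToSql;
        this.rewrite = rewrite;
        this.validator = validator;
        this.sqlToRel = sqlToRel;
    }

    /** Returns the plan untouched when a skip pattern matches; otherwise runs the full round trip. */
    R validate(R plan) {
        if (shouldSkip.test(plan)) {
            return plan;
        }
        S sql = relToSql.apply(plan);
        S rewritten = rewrite.apply(sql);
        S validated = validator.apply(rewritten);
        return sqlToRel.apply(validated);
    }
}
```

Because the whole round trip sits behind a single `validate` entry point, leaving it uncalled from the execute/explain flow is enough to keep behavior unchanged, which matches the claim that existing tests pass.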
What's included
Core Validation Pipeline (new `calcite/validate/` package):
- `PplValidator`: Custom SQL validator for PPL with UDT-aware `deriveType`
- `ValidationUtils`: Type sync, UDT creation helpers
- `OpenSearchSparkSqlDialect`: Extended Spark SQL dialect for OpenSearch
- `PplConvertletTable`, `SqlOperatorTableProvider`

Converters (RelNode ↔ SqlNode round-trip):
- `OpenSearchRelToSqlConverter`: Handles SEMI/ANTI joins, IN/NOT IN tuple rewriting, sort preservation
- `OpenSearchSqlToRelConverter`: Hint strategy, field trimming, JSON_TYPE disabling

Shuttles:
- `PplRelToSqlRelShuttle`: Interval literal fixing, collation embedding, bucket_nullable hints
- `SkipRelValidationShuttle`: Detects patterns that should skip validation (bin-on-timestamp, group-by-CASE, LogicalValues)
- `SqlRewriteShuttle`: Removes database qualifiers before validation

Type System:
- `OpenSearchTypeFactory` / `OpenSearchTypeUtil`: UDT ↔ SqlTypeName mapping, composite types
- `PplTypeCoercion` / `PplTypeCoercionRule`: Custom coercion blacklist/whitelist
- `CoercionUtils`: SAFE_CAST vs CAST decision logic

Integration Point:
- `QueryService`: Adds `validate()` and `doValidate()` methods (NOT yet wired into the execute/explain flow; that happens in Sub-PR C)

Unit Tests
All new code has unit tests under `calcite/validate/` and `calcite/utils/`.

Dependency
Independent: can be reviewed and merged in parallel with Sub-PR B (#5214).
graph LR
    A["#5213 Infrastructure + Type System"] --> C["#5215 Enable Validation + Tests"]
    B["#5214 Function Operand Types"] --> C
    C --> D["#5216 Cleanup + Docs"]
    style A fill:#4da6ff,color:#fff
    style B fill:#ccc,color:#333
    style C fill:#ccc,color:#333
    style D fill:#ccc,color:#333

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
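As a reading aid for the type system items (coercion blacklist/whitelist and the SAFE_CAST vs CAST decision), the following sketch shows one way such a decision could be structured. Every rule in it is an assumption chosen for illustration: the `CoarseType` categories, the contents of the allow and deny tables, and the "strings get SAFE_CAST" heuristic are not taken from this PR. The only fact relied on is that a SAFE_CAST-style operator returns NULL when a conversion fails, whereas CAST raises an error.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class CoercionSketchDemo {
    public static void main(String[] args) {
        System.out.println(CoercionSketch.decide(CoarseType.STRING, CoarseType.NUMERIC));
        System.out.println(CoercionSketch.decide(CoarseType.NUMERIC, CoarseType.STRING));
        System.out.println(CoercionSketch.decide(CoarseType.BOOLEAN, CoarseType.NUMERIC));
    }
}

/** Illustrative type categories; not the PR's actual type families. */
enum CoarseType { NUMERIC, STRING, BOOLEAN, DATETIME }

enum CastDecision { NO_CAST, CAST, SAFE_CAST, REJECT }

final class CoercionSketch {
    // Whitelist (assumed): implicit conversions this sketch allows, keyed by source type.
    private static final Map<CoarseType, Set<CoarseType>> ALLOWED = new EnumMap<>(CoarseType.class);
    // Blacklist (assumed): conversions refused even when the whitelist would permit them.
    private static final Map<CoarseType, Set<CoarseType>> DENIED = new EnumMap<>(CoarseType.class);

    static {
        ALLOWED.put(CoarseType.STRING, EnumSet.of(CoarseType.NUMERIC, CoarseType.DATETIME));
        ALLOWED.put(CoarseType.NUMERIC, EnumSet.of(CoarseType.STRING));
        ALLOWED.put(CoarseType.BOOLEAN, EnumSet.of(CoarseType.STRING, CoarseType.NUMERIC));
        DENIED.put(CoarseType.BOOLEAN, EnumSet.of(CoarseType.NUMERIC));
    }

    private CoercionSketch() {}

    static CastDecision decide(CoarseType from, CoarseType to) {
        if (from == to) {
            return CastDecision.NO_CAST;
        }
        Set<CoarseType> none = EnumSet.noneOf(CoarseType.class);
        // The blacklist wins over the whitelist.
        if (DENIED.getOrDefault(from, none).contains(to)) {
            return CastDecision.REJECT;
        }
        if (!ALLOWED.getOrDefault(from, none).contains(to)) {
            return CastDecision.REJECT;
        }
        // Parsing a string can fail for individual rows, so prefer the null-on-failure cast there.
        return from == CoarseType.STRING ? CastDecision.SAFE_CAST : CastDecision.CAST;
    }
}
```

Keeping the allow and deny tables as data rather than branching logic makes the coercion policy easy to review in isolation, which may matter once validation is enabled in the follow-up PR.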