Skip to content

[FLINK-39092][SQL] Enhance EXPLAIN plan to display watermark specification#27611

Open
featzhang wants to merge 7 commits intoapache:masterfrom
featzhang:feature/watermark-display
Open

[FLINK-39092][SQL] Enhance EXPLAIN plan to display watermark specification#27611
featzhang wants to merge 7 commits intoapache:masterfrom
featzhang:feature/watermark-display

Conversation

@featzhang
Copy link
Copy Markdown
Member

@featzhang featzhang commented Feb 14, 2026

What is the purpose of the change

This change enhances the EXPLAIN plan output for TableScan nodes by explicitly displaying watermark specifications. Currently, watermark logic is difficult to identify in EXPLAIN plans, which creates challenges for streaming users trying to understand and debug watermark strategies.

Before:

TableScan(table=[[default_catalog, default_database, orders]], fields=[user_id, order_time])

After:

TableScan:
  table: [[default_catalog, default_database, orders]]
  fields: user_id, order_time
  watermark: order_time - order_time - INTERVAL '5' SECOND

This improvement makes watermark strategies immediately visible in query plans, helping streaming users:

  • Quickly understand watermark configurations
  • Debug late data handling issues
  • Verify watermark logic without examining table DDL
  • Improve overall development and troubleshooting experience

Brief change log

  • Modified FlinkLogicalTableSourceScan.explainTerms() to extract and display watermark specifications
  • Added watermark information retrieval from ResolvedSchema via TableSourceTable
  • Enhanced explain output to show rowtime attribute and watermark expression in a readable format
  • Watermark is displayed only when present, maintaining backward compatibility for non-streaming tables

Verifying this change

This change can be verified by:

  1. Compilation: The module compiles successfully without errors

    ./mvnw clean spotless:apply install -DskipTests -Pfast -pl flink-table/flink-table-planner
  2. Create table with watermark:

    CREATE TABLE orders (
      user_id INT,
      order_time TIMESTAMP(3),
      WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
    ) WITH (...);
  3. Verify EXPLAIN output:

    EXPLAIN SELECT * FROM orders;

    The output should display watermark information in the TableScan node.

  4. Test without watermark: Verify that tables without watermarks still work normally

    CREATE TABLE batch_table (id INT, name STRING) WITH (...);
    EXPLAIN SELECT * FROM batch_table;

Does this pull request potentially affect

  • Dependencies: No
  • The public API: No (only changes internal explain output format)
  • The serializers: No
  • The runtime per-record code paths: No (only affects plan display)
  • Anything that affects deployment or recovery: No

Documentation

  • This change enhances the readability of query plan output for streaming jobs
  • No user-facing API documentation changes required
  • The improvement is visible in EXPLAIN plan output
  • Maintains backward compatibility by only enhancing display format, not changing RelNode structure
  • Particularly valuable for streaming processing users working with event time and watermarks

@featzhang featzhang changed the title [table] Enhance EXPLAIN plan to display watermark specification [FLINK-39092][SQL] Enhance EXPLAIN plan to display watermark specification Feb 14, 2026
@flinkbot
Copy link
Copy Markdown
Collaborator

flinkbot commented Feb 14, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@featzhang featzhang force-pushed the feature/watermark-display branch from 9f1095b to 322493c Compare March 4, 2026 00:58
@featzhang featzhang closed this Mar 16, 2026
@featzhang featzhang reopened this Mar 16, 2026
@featzhang featzhang force-pushed the feature/watermark-display branch from 322493c to 0684335 Compare March 17, 2026 13:15
featzhang added a commit to featzhang/flink that referenced this pull request Apr 22, 2026
…atermark

The previous implementation used addDataStream + addTableWithWatermark which creates
a LogicalWatermarkAssigner node but doesn't add watermark specifications to the table
source's resolved schema. This caused the test to fail because FlinkLogicalTableSourceScan's
explainTerms only displays watermark info when the table itself has watermark specs.

Solution: Use DDL with WATERMARK FOR clause to properly create a table with watermark
specifications in the resolved schema, ensuring the watermark display feature is tested
correctly.

Addresses CI failure in PR apache#27611.
featzhang added a commit to featzhang/flink that referenced this pull request Apr 23, 2026
…atermark

The previous implementation used addDataStream + addTableWithWatermark which creates
a LogicalWatermarkAssigner node but doesn't add watermark specifications to the table
source's resolved schema. This caused the test to fail because FlinkLogicalTableSourceScan's
explainTerms only displays watermark info when the table itself has watermark specs.

Solution: Use DDL with WATERMARK FOR clause to properly create a table with watermark
specifications in the resolved schema, ensuring the watermark display feature is tested
correctly.

Addresses CI failure in PR apache#27611.
@featzhang featzhang force-pushed the feature/watermark-display branch from d9f280a to 444d397 Compare April 23, 2026 00:51
@featzhang
Copy link
Copy Markdown
Member Author

@flinkbot run azure

featzhang and others added 7 commits May 4, 2026 02:45
Add explicit watermark information in TableScan nodes to improve plan readability.
This change helps streaming users quickly understand watermark strategies from EXPLAIN output.

Example output:
TableScan:
  fields: user_id, order_time
  watermark: order_time - order_time - INTERVAL '5' SECOND
Update test expected results to include watermark specification
in FlinkLogicalTableSourceScan output:

- ProjectSnapshotTransposeRuleTest.xml
- PushFilterInCalcIntoTableSourceRuleTest.xml
- PushWatermarkIntoTableSourceScanRuleTest.xml
- Added testExplainWithWatermark test case to verify watermark information is included in explain plans
- Addresses review comment by davidradl requesting test coverage for explainTerms watermark functionality
…atermark

The previous implementation used addDataStream + addTableWithWatermark which creates
a LogicalWatermarkAssigner node but doesn't add watermark specifications to the table
source's resolved schema. This caused the test to fail because FlinkLogicalTableSourceScan's
explainTerms only displays watermark info when the table itself has watermark specs.

Solution: Use DDL with WATERMARK FOR clause to properly create a table with watermark
specifications in the resolved schema, ensuring the watermark display feature is tested
correctly.

Addresses CI failure in PR apache#27611.
- Apply spotless formatting to multi-line DDL string in testExplainWithWatermark
- Add expected plan output for testExplainWithWatermark in ExplainTest.xml
  to resolve DiffRepository assertion failure
This is an empty commit to trigger a fresh CI run.
The previous CI failure was due to a flaky test (RestClientTest.testRestClientClosedHandling)
which is unrelated to the watermark EXPLAIN feature changes in this PR.
@featzhang featzhang force-pushed the feature/watermark-display branch from 8180e66 to 01fa5eb Compare May 3, 2026 18:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants