Add Iceberg tag time travel support #4211

Open
sfc-gh-igarish wants to merge 1 commit into main from igarish/snowpark-python-iceberg-tags

@sfc-gh-igarish

Add support for reading Iceberg tables using snapshot tags. This enables time travel queries using Iceberg snapshot tags with the AT(ICEBERG_TAG) syntax.

Changes:

  • Add iceberg_tag field to TimeTravelConfig NamedTuple
  • Update validate_and_normalize_params to handle iceberg_tag (only works with time_travel_mode='at')
  • Update generate_sql_clause to produce AT(ICEBERG_TAG => 'tag_name')
  • Add iceberg_tag parameter to Table.__init__, Session.table(), and DataFrameReader.table()
  • Add TAG and ICEBERG_TAG options mapping in DataFrameReader
  • Add unit tests for iceberg_tag time travel functionality
  • Update AST proto with iceberg_tag field

Usage:

Direct parameter:
  session.table("my_iceberg_table", time_travel_mode="at", iceberg_tag="v1")

Via DataFrameReader option (Spark-compatible):
  session.read.option("tag", "v1").table("my_iceberg_table")
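
For reviewers who want to see the intended behavior in isolation, here is a minimal sketch of the validation and SQL clause generation described in the change list. It is not the code in this PR: `TimeTravelSketch`, `validate`, and `to_sql_clause` are hypothetical names, and the rule that `stream` and `iceberg_tag` are mutually exclusive is an assumption made for illustration. Only the `iceberg_tag` requires `time_travel_mode='at'` rule and the `AT(ICEBERG_TAG => '...')` output come from the description above.

```python
from typing import NamedTuple, Optional


class TimeTravelSketch(NamedTuple):
    """Hypothetical, simplified stand-in for TimeTravelConfig."""

    mode: str                          # "at" or "before"
    stream: Optional[str] = None
    iceberg_tag: Optional[str] = None  # field added by this PR

    @classmethod
    def validate(cls, mode, stream=None, iceberg_tag=None):
        """Normalize the mode and reject unsupported combinations."""
        mode = mode.lower()
        if mode not in ("at", "before"):
            raise ValueError(f"unsupported time_travel_mode: {mode!r}")
        # Assumption for this sketch: only one selector may be set.
        if stream is not None and iceberg_tag is not None:
            raise ValueError("set only one of stream or iceberg_tag")
        if stream is None and iceberg_tag is None:
            raise ValueError("one of stream or iceberg_tag is required")
        # From the PR description: iceberg_tag only works with mode 'at'.
        if iceberg_tag is not None and mode != "at":
            raise ValueError("iceberg_tag requires time_travel_mode='at'")
        return cls(mode, stream, iceberg_tag)

    def to_sql_clause(self) -> str:
        """Build the time travel clause, mirroring the elif chain in the diff."""
        clause = self.mode.upper()
        if self.stream is not None:
            clause += f"(STREAM => '{self.stream}')"
        elif self.iceberg_tag is not None:
            clause += f"(ICEBERG_TAG => '{self.iceberg_tag}')"
        return clause


config = TimeTravelSketch.validate("AT", iceberg_tag="v1")
print(config.to_sql_clause())
```

The sketch interpolates the tag name directly into the SQL string, as the diff does; whether tag names need escaping is left open here.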

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-NNNNNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
    • If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

Add support for reading Iceberg tables using snapshot tags. This enables
time travel queries using Iceberg snapshot tags with the AT(ICEBERG_TAG)
syntax.

Changes:
- Add iceberg_tag field to TimeTravelConfig NamedTuple
- Update validate_and_normalize_params to handle iceberg_tag (only works
  with time_travel_mode='at')
- Update generate_sql_clause to produce AT(ICEBERG_TAG => 'tag_name')
- Add iceberg_tag parameter to Table.__init__, Session.table(), and
  DataFrameReader.table()
- Add TAG and ICEBERG_TAG options mapping in DataFrameReader
- Add unit tests for iceberg_tag time travel functionality
- Update AST proto with iceberg_tag field

Usage:
  # Direct parameter
  session.table("my_iceberg_table", time_travel_mode="at", iceberg_tag="v1")

  # Via DataFrameReader option (Spark-compatible)
  session.read.option("tag", "v1").table("my_iceberg_table")

Co-authored-by: Cursor <cursoragent@cursor.com>
@sfc-gh-igarish sfc-gh-igarish requested review from a team as code owners May 4, 2026 22:31
@sfc-gh-igarish sfc-gh-igarish requested review from sfc-gh-aalam, sfc-gh-aling and sfc-gh-yuwang and removed request for sfc-gh-aling May 4, 2026 22:31
@codecov-commenter

Codecov Report

❌ Patch coverage is 38.09524% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.95%. Comparing base (e56041f) to head (cdd1ba1).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/snowflake/snowpark/dataframe_reader.py | 25.00% | 7 Missing and 2 partials ⚠️ |
| src/snowflake/snowpark/session.py | 0.00% | 1 Missing and 1 partial ⚠️ |
| src/snowflake/snowpark/table.py | 0.00% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4211      +/-   ##
==========================================
- Coverage   95.42%   93.95%   -1.47%     
==========================================
  Files         171      171              
  Lines       43835    43857      +22     
  Branches     7513     7520       +7     
==========================================
- Hits        41829    41207     -622     
- Misses       1226     1859     +633     
- Partials      780      791      +11     

google.protobuf.StringValue time_travel_mode = 7;
Expr timestamp = 8;
google.protobuf.StringValue timestamp_type = 9;
google.protobuf.StringValue iceberg_tag = 10;
Collaborator

Is the client change generated from the monorepo AST definition?

As I recall, a client AST change isn't strictly required right now. If it is needed, it should be derived from the monorepo updates; there is a step-by-step doc for the AST monorepo updates.

@sfc-gh-heshah what's your recommendation here? Can we leave the AST change out of this PR and do it later?

elif self.stream is not None:
    clause += f"(STREAM => '{self.stream}')"
elif self.iceberg_tag is not None:
    clause += f"(ICEBERG_TAG => '{self.iceberg_tag}')"
Collaborator

I can't find this feature in the public docs.
What is the release status of this Iceberg tag feature?

"TIMESTAMP_TYPE": "timestamp_type",
"STREAM": "stream",
"ICEBERG_TAG": "iceberg_tag",
"TAG": "iceberg_tag",
Collaborator

When using tag as an alias for iceberg_tag, is that mapping defined in the Snowflake Iceberg spec, or is it a Spark convention?

Snowflake has its own "tag" concept. Will this cause a conflict in the DataFrameReader context in the future?
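
To make the aliasing question concrete, here is a rough sketch of how the two option keys could resolve to the same parameter. The dictionary entries are copied from the diff hunk above; the function name `resolve_time_travel_options`, the case-insensitive lookup, and the conflict check are assumptions for illustration and may not match what DataFrameReader actually does.

```python
from typing import Any, Dict

# Entries copied from the diff hunk; "TAG" is the alias under discussion.
OPTION_TO_PARAM = {
    "TIMESTAMP_TYPE": "timestamp_type",
    "STREAM": "stream",
    "ICEBERG_TAG": "iceberg_tag",
    "TAG": "iceberg_tag",
}


def resolve_time_travel_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: map reader options to time travel parameters."""
    resolved: Dict[str, Any] = {}
    for key, value in options.items():
        # Assumption: option names are matched case-insensitively.
        param = OPTION_TO_PARAM.get(key.upper())
        if param is None:
            continue  # not a time travel option
        # Assumption: two aliases with different values are an error.
        if param in resolved and resolved[param] != value:
            raise ValueError(f"conflicting values for {param!r}")
        resolved[param] = value
    return resolved


print(resolve_time_travel_options({"tag": "v1"}))
```

With this shape, any future reader option named "tag" that refers to Snowflake object tagging would collide with the alias, which is the concern raised above.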
