Skip to content

PHOENIX-7751 : [SyncTable Tool] Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster#2379

Merged
haridsv merged 41 commits intoapache:masterfrom
rahulLiving:PHOENIX-7751
Apr 14, 2026
Merged

PHOENIX-7751 : [SyncTable Tool] Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster#2379
haridsv merged 41 commits intoapache:masterfrom
rahulLiving:PHOENIX-7751

Conversation

@rahulLiving
Copy link
Copy Markdown
Contributor

No description provided.

@rahulLiving rahulLiving marked this pull request as ready for review March 12, 2026 12:36

/**
* PhoenixSyncTableTool chunk metadata cell qualifiers. These define the wire protocol between
* hoenixSyncTableRegionScanner (server-side coprocessor) and PhoenixSyncTableMapper (client-side
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo missing 'P'


public static Long getPhoenixSyncTableFromTime(Configuration conf) {
Preconditions.checkNotNull(conf);
String value = conf.get(PHOENIX_SYNC_TABLE_FROM_TIME);
Copy link
Copy Markdown
Contributor

@tkhurana tkhurana Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why didn't you use conf.getLong() ?

conf.setLong(PHOENIX_SYNC_TABLE_TO_TIME, toTime);
}

public static Long getPhoenixSyncTableToTime(Configuration conf) {
Copy link
Copy Markdown
Contributor

@tkhurana tkhurana Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also why didn't you use conf.getLong ?

@rahulLiving
Copy link
Copy Markdown
Contributor Author

I see a generics compiler warning that can be fixed with the following change:

Fixed it for my changes.

Comment thread phoenix-core-client/src/main/java/org/apache/phoenix/util/SHA256DigestUtil.java Outdated
// Not using try-with-resources since ChunkScannerContext owns the table lifecycle
Table hTable =
conn.unwrap(PhoenixConnection.class).getQueryServices().getTable(physicalTableName);
Scan scan =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rahulLiving One thing I realized we are not setting ttl attribute on the scan. We should so that we can mask expired rows. Also, add a test case for the same.

scan.setAttribute(BaseScannerRegionObserverConstants.SYNC_TABLE_CHUNK_FORMATION, TRUE_BYTES);
scan.setAttribute(BaseScannerRegionObserverConstants.SKIP_REGION_BOUNDARY_CHECK, TRUE_BYTES);
scan.setAttribute(BaseScannerRegionObserverConstants.UNGROUPED_AGG, TRUE_BYTES);
ScanUtil.setScanAttributesForPhoenixTTL(scan, pTable, phoenixConn);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't expired rows get skipped even in non-strict mode? Perhaps we should override the table attribute?

waitForReplication(targetConnection, uniqueTableName, 7);

int sourceCount = getRowCount(sourceConnection, uniqueTableName);
int targetCount = getRowCount(targetConnection, uniqueTableName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't know if the data got really skipped as the queries for validation would also have skipped them. We can perhaps alter the attribute before running the verification queries, I am not sure.

Alternatively, if we always run in strict-mode like I suggested above, we can create the table with IS_STRICT_TTL=false and we don't need to do anything extra here to verify.

Copy link
Copy Markdown
Contributor

@haridsv haridsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, we need one +ve test scenario like testSyncTableWithConditionalTTLExpiredRows for TTL that verifies that sync tool passes and then run a compaction on target, confirm that row count is 7 vs 10 and then repeat the sync tool to confirm that it still passes.

@haridsv haridsv merged commit f5f093e into apache:master Apr 14, 2026
@virajjasani
Copy link
Copy Markdown
Contributor

@rahulLiving we don't define pom dependency with version on individual module poms e.g.

    <dependency>
      <groupId>org.bouncycastle</groupId>
      <artifactId>bcprov-jdk18on</artifactId>
      <version>1.79</version>
      <scope>test</scope>
    </dependency>

This dependency with version should not be defined in core or core-server poms. We should define the versions/scopes on main pom. On inidividual poms, we only need to refer to them, maybe scope is fine but version is not needed.

@virajjasani
Copy link
Copy Markdown
Contributor

I also don't see jenkins build results for any commits after a951251

@rahulLiving
Copy link
Copy Markdown
Contributor Author

Yes @virajjasani I observed the build result was not showing up as PR validation status on PR from some time back. I was monitoring the build status from Jenkins build page for this PR.

@rahulLiving
Copy link
Copy Markdown
Contributor Author

@rahulLiving we don't define pom dependency with version on individual module poms e.g.

Sure. I have a follow up PR in pipeline. Will include this change there. Thanks for checking

@virajjasani
Copy link
Copy Markdown
Contributor

Thank you @rahulLiving

@virajjasani
Copy link
Copy Markdown
Contributor

Just one more question - does this dependency work for all JDKs? i.e. from 8 to 17 that we support?

haridsv pushed a commit that referenced this pull request Apr 28, 2026
#2417)

* PHOENIX-7751: Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2379)

---------

Co-authored-by: Rahul Kumar <rahul.kumar@salesforce.com>
haridsv pushed a commit that referenced this pull request Apr 28, 2026
…ool b/w source and target cluster (#2419)

* PHOENIX-7751: Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2379)

* 5.2 compatible changes

---------

Co-authored-by: Rahul Kumar <rahul.kumar@salesforce.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants