PHOENIX-7751: [SyncTable Tool] Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2379)
Conversation
```java
/**
 * PhoenixSyncTableTool chunk metadata cell qualifiers. These define the wire protocol between
 * PhoenixSyncTableRegionScanner (server-side coprocessor) and PhoenixSyncTableMapper (client-side
```
```java
public static Long getPhoenixSyncTableFromTime(Configuration conf) {
  Preconditions.checkNotNull(conf);
  String value = conf.get(PHOENIX_SYNC_TABLE_FROM_TIME);
```
Why didn't you use `conf.getLong()`?
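For context on this question: Hadoop's `Configuration.getLong(name, defaultValue)` always returns a primitive with a fallback value, so a getter that needs to return `null` when the key is unset has to fetch the raw string and parse it itself, as the PR does. A minimal plain-Java sketch of the two lookup styles (helper names are hypothetical, and a `Map` stands in for `Configuration`):

```java
import java.util.HashMap;
import java.util.Map;

public class ConfLookupSketch {
    // Mimics conf.get(key) followed by a manual parse: returns null when the key is absent.
    static Long getNullableLong(Map<String, String> conf, String key) {
        String value = conf.get(key);
        return value == null ? null : Long.parseLong(value);
    }

    // Mimics Configuration.getLong(key, defaultValue): can never signal "unset" as null.
    static long getLongWithDefault(Map<String, String> conf, String key, long def) {
        String value = conf.get(key);
        return value == null ? def : Long.parseLong(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("phoenix.sync.table.from.time", "1700000000000");

        System.out.println(getNullableLong(conf, "phoenix.sync.table.from.time")); // 1700000000000
        System.out.println(getNullableLong(conf, "phoenix.sync.table.to.time"));   // null
        System.out.println(getLongWithDefault(conf, "phoenix.sync.table.to.time", -1L)); // -1
    }
}
```

So `conf.getLong()` is the simpler call when a sensible default exists, but the manual parse is the natural choice if callers must distinguish "not configured" from any numeric value.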
```java
conf.setLong(PHOENIX_SYNC_TABLE_TO_TIME, toTime);
}
```

```java
public static Long getPhoenixSyncTableToTime(Configuration conf) {
```
Here also, why didn't you use `conf.getLong`?

Fixed it for my changes.
```java
// Not using try-with-resources since ChunkScannerContext owns the table lifecycle
Table hTable =
    conn.unwrap(PhoenixConnection.class).getQueryServices().getTable(physicalTableName);
Scan scan =
```
@rahulLiving One thing I realized: we are not setting the TTL attribute on the scan. We should, so that we can mask expired rows. Also, add a test case for the same.
```java
scan.setAttribute(BaseScannerRegionObserverConstants.SYNC_TABLE_CHUNK_FORMATION, TRUE_BYTES);
scan.setAttribute(BaseScannerRegionObserverConstants.SKIP_REGION_BOUNDARY_CHECK, TRUE_BYTES);
scan.setAttribute(BaseScannerRegionObserverConstants.UNGROUPED_AGG, TRUE_BYTES);
ScanUtil.setScanAttributesForPhoenixTTL(scan, pTable, phoenixConn);
```
Shouldn't expired rows get skipped even in non-strict mode? Perhaps we should override the table attribute?
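To make the strict vs. non-strict distinction under discussion concrete, here is a plain-Java model of the visibility rule (illustrative only; the class and method names are hypothetical, and Phoenix's actual TTL masking happens in the server-side scanner):

```java
public class TtlVisibilitySketch {
    // Strict TTL: a cell past its TTL is masked from scans even before compaction
    // physically removes it. Non-strict TTL: the cell stays visible until compaction.
    static boolean isVisible(long cellTs, long now, long ttlMs, boolean strictTtl) {
        boolean expired = (now - cellTs) >= ttlMs;
        return !strictTtl || !expired;
    }

    public static void main(String[] args) {
        long now = 120_000L, ttl = 60_000L;
        System.out.println(isVisible(100_000L, now, ttl, true)); // true: fresh cell
        System.out.println(isVisible(0L, now, ttl, true));       // false: masked in strict mode
        System.out.println(isVisible(0L, now, ttl, false));      // true: visible in non-strict mode
    }
}
```

Under this model, overriding the table attribute to force strict behavior would make the sync tool's view independent of whether compaction has run yet.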
```java
waitForReplication(targetConnection, uniqueTableName, 7);

int sourceCount = getRowCount(sourceConnection, uniqueTableName);
int targetCount = getRowCount(targetConnection, uniqueTableName);
```
We won't know if the data really got skipped, as the validation queries would also have skipped it. We could perhaps alter the attribute before running the verification queries; I am not sure.
Alternatively, if we always run in strict mode like I suggested above, we can create the table with IS_STRICT_TTL=false and we don't need to do anything extra here to verify.
haridsv left a comment:
As discussed offline, we need one positive TTL test scenario like testSyncTableWithConditionalTTLExpiredRows that verifies the sync tool passes, then runs a compaction on the target, confirms that the row count is 7 vs 10, and then repeats the sync tool to confirm that it still passes.
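The positive scenario requested above can be sketched end to end in plain Java (a toy model, not Phoenix APIs; all names here are hypothetical): 10 rows with 3 past TTL, where a TTL-masking count sees 7 both before and after a compaction that physically drops the expired rows, which is why the sync tool should pass on both runs:

```java
import java.util.ArrayList;
import java.util.List;

public class TtlCompactionSketch {
    static final long TTL_MS = 60_000L;

    // A strict-TTL scan masks rows older than the TTL.
    static long maskedCount(List<Long> cellTimestamps, long now) {
        return cellTimestamps.stream().filter(ts -> now - ts < TTL_MS).count();
    }

    // Major compaction physically removes expired rows.
    static List<Long> compact(List<Long> cellTimestamps, long now) {
        List<Long> kept = new ArrayList<>();
        for (long ts : cellTimestamps) {
            if (now - ts < TTL_MS) kept.add(ts);
        }
        return kept;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        List<Long> rows = new ArrayList<>();
        for (int i = 0; i < 7; i++) rows.add(now - 1_000L);    // 7 live rows
        for (int i = 0; i < 3; i++) rows.add(now - 120_000L);  // 3 expired rows

        System.out.println(rows.size());            // 10 physical rows before compaction
        System.out.println(maskedCount(rows, now)); // 7 visible to a TTL-masking scan

        rows = compact(rows, now);
        System.out.println(rows.size());            // 7 physical rows after compaction
        System.out.println(maskedCount(rows, now)); // 7: sync comparison still matches
    }
}
```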
@rahulLiving We don't define pom dependencies with versions on individual module poms; e.g., this dependency with version should not be defined in the core or core-server poms. We should define the versions/scopes in the main pom. In individual poms, we only need to refer to them; a scope is fine, but the version is not needed.
I also don't see Jenkins build results for any commits after a951251.
Yes @virajjasani, I observed the build result was not showing up as a PR validation status on the PR for some time. I was monitoring the build status from the Jenkins build page for this PR.
Sure. I have a follow-up PR in the pipeline. Will include this change there. Thanks for checking.
Thank you @rahulLiving |
Just one more question - does this dependency work for all JDKs? i.e. from 8 to 17 that we support? |
PHOENIX-7751: Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2417)
* PHOENIX-7751: Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2379)
Co-authored-by: Rahul Kumar <rahul.kumar@salesforce.com>

PHOENIX-7751: Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2419)
* PHOENIX-7751: Feature to validate table data using PhoenixSyncTable tool b/w source and target cluster (#2379)
* 5.2 compatible changes
Co-authored-by: Rahul Kumar <rahul.kumar@salesforce.com>