MLE-27078 Incremental write now uses an unsignedLong #1908

Merged
rjrudin merged 1 commit into develop from feature/optic-tweak
Feb 23, 2026

@rjrudin rjrudin commented Feb 23, 2026

I'm not sure the eval approach will be kept, but all 3 approaches (Optic, View, and Eval) now properly use an unsignedLong.

Also fixed a performance issue in the view filter where "op.in" was being used instead of a "where" + documentQuery.

@rjrudin rjrudin requested a review from BillFarber as a code owner February 23, 2026 15:47
Copilot AI review requested due to automatic review settings February 23, 2026 15:47
@rjrudin rjrudin requested a review from stevebio as a code owner February 23, 2026 15:47

github-actions bot commented Feb 23, 2026

Copyright Validation Results
Total: 9 | Passed: 6 | Failed: 0 | Skipped: 3 | at: 2026-02-23 16:07:13 UTC | commit: f8e8c87

⏭️ Skipped (Excluded) Files

  • Jenkinsfile
  • test-app/src/main/ml-config/databases/content-database.json
  • test-app/src/main/ml-schemas/tde/incrementalWriteHash.json

✅ Valid Files

  • marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteEvalFilter.java
  • marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteFilter.java
  • marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteOpticFilter.java
  • marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteViewFilter.java
  • marklogic-client-api/src/test/java/com/marklogic/client/datamovement/filter/IncrementalWriteFilterTest.java
  • marklogic-client-api/src/test/java/com/marklogic/client/datamovement/filter/IncrementalWriteTest.java

✅ All files have valid copyright headers!

@rjrudin rjrudin force-pushed the feature/optic-tweak branch 2 times, most recently from ec5eabd to d6519e4 Compare February 23, 2026 15:50

Copilot AI left a comment

Pull request overview

Updates incremental write hashing to use an unsigned long representation consistently across Optic, View (TDE), and Eval-based approaches, and improves the View-based filter performance by switching to a where + cts.documentQuery predicate.

Changes:

  • Store and compare incremental write hashes as unsigned longs (serialized as base-10 strings in metadata).
  • Update Optic/View/Eval filters and tests to parse/handle unsigned long hashes.
  • Optimize View filter row selection by replacing op.in with where(op.cts.documentQuery(...)).
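
As a rough illustration of the first two bullets, the sketch below round-trips a 64-bit hash through its unsigned base-10 string form using only `java.lang.Long`. The class and method names (`UnsignedHashRoundTrip`, `toMetadataValue`, `fromMetadataValue`, `isUnchanged`) are hypothetical and not taken from this PR; the actual filter code may be structured differently.

```java
public class UnsignedHashRoundTrip {

    // Serialize a 64-bit hash as an unsigned base-10 string, e.g. for a metadata value.
    static String toMetadataValue(long hash) {
        return Long.toUnsignedString(hash);
    }

    // Parse the stored string back into a long carrying the same 64 bits.
    static long fromMetadataValue(String value) {
        return Long.parseUnsignedLong(value);
    }

    // True when a previously stored hash matches the newly computed one.
    static boolean isUnchanged(String storedValue, long newHash) {
        return storedValue != null && fromMetadataValue(storedValue) == newHash;
    }

    public static void main(String[] args) {
        // Hypothetical hash with all 64 bits set; as a signed long this is -1.
        long hash = 0xFFFFFFFFFFFFFFFFL;
        String stored = toMetadataValue(hash);
        System.out.println(stored);                    // 18446744073709551615
        System.out.println(isUnchanged(stored, hash)); // true
    }
}
```

Because Java has no unsigned 64-bit primitive, the bits live in a signed `long` and only the string conversions treat them as unsigned; ordering comparisons, if needed, would go through `Long.compareUnsigned`.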

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| test-app/src/main/ml-schemas/tde/incrementalWriteHash.json | Updates TDE view column name/type for the hash to unsignedLong. |
| test-app/src/main/ml-config/databases/content-database.json | Switches range index scalar types for hash fields to unsignedLong. |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteFilter.java | Changes hash computation/storage to unsigned-long semantics (base-10 string in metadata). |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteOpticFilter.java | Parses existing hashes as unsigned longs from lexicon results. |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteViewFilter.java | Uses cts.documentQuery in where and parses hashes as unsigned longs from the view. |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteEvalFilter.java | Adjusts eval script/response parsing for unsigned long handling. |
| marklogic-client-api/src/test/java/com/marklogic/client/datamovement/filter/IncrementalWriteTest.java | Adds cross-compat tests between eval/optic filters and updates hash validation parsing. |
| marklogic-client-api/src/test/java/com/marklogic/client/datamovement/filter/IncrementalWriteFilterTest.java | Updates unit test to reflect numeric hash inputs/storage. |

Comment on lines 22 to 30
// The hash value is cast to a String based on this analysis from Copilot:
// "Since the field index is unsignedLong, cts.valueTuples returns an
// xs:unsignedLong value, which JavaScript represents as an IEEE 754
// double. Values above −1 (~9.2 quadrillion) would silently lose precision."
// It is then cast back to an unsignedLong when the value is retrieved
// from the JSON response object.
private static final String EVAL_SCRIPT = """
const tuples = cts.valueTuples([cts.uriReference(), cts.fieldReference(hashKeyName)], null, cts.documentQuery(uris));
const response = {};

Copilot AI Feb 23, 2026

The inline comment references "analysis from Copilot" and includes an incorrect/unclear numeric bound ("Values above −1"). Please rewrite this as a source-neutral explanation (e.g., JavaScript numeric precision above 2^53−1) and keep the rationale accurate and concise.

Suggested change
- // The hash value is cast to a String based on this analysis from Copilot:
- // "Since the field index is unsignedLong, cts.valueTuples returns an
- // xs:unsignedLong value, which JavaScript represents as an IEEE 754
- // double. Values above −1 (~9.2 quadrillion) would silently lose precision."
- // It is then cast back to an unsignedLong when the value is retrieved
- // from the JSON response object.
- private static final String EVAL_SCRIPT = """
- const tuples = cts.valueTuples([cts.uriReference(), cts.fieldReference(hashKeyName)], null, cts.documentQuery(uris));
- const response = {};
+ // The hash field index is xs:unsignedLong, which JavaScript represents as an
+ // IEEE 754 double. To avoid loss of precision for large integers (e.g., above
+ // 2^53−1), the value is converted to a String in JavaScript and then parsed
+ // back to an unsigned long when it is read from the JSON response.
+ private static final String EVAL_SCRIPT = """
+ const tuples = cts.valueTuples([cts.uriReference(), cts.fieldReference(hashKeyName)], null, cts.documentQuery(uris));
+ const response = {};
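
The precision concern behind this suggestion can be reproduced in plain Java, where casting a `long` to `double` stands in for how a JavaScript number would hold the value. This is only a sketch: the class and method names (`DoublePrecisionDemo`, `throughDouble`, `throughString`) are hypothetical, and it uses positive signed values just above 2^53 rather than the full unsigned range.

```java
public class DoublePrecisionDemo {

    // Largest integer an IEEE 754 double can represent without gaps: 2^53 - 1.
    static final long MAX_SAFE_INTEGER = (1L << 53) - 1;

    // Pass a value through a double, as a JavaScript number would hold it.
    static long throughDouble(long value) {
        return (long) (double) value;
    }

    // Pass a value through its unsigned base-10 string form instead.
    static long throughString(long value) {
        return Long.parseUnsignedLong(Long.toUnsignedString(value));
    }

    public static void main(String[] args) {
        long safe = MAX_SAFE_INTEGER;   // 9007199254740991
        long unsafe = (1L << 53) + 1;   // 9007199254740993

        System.out.println(throughDouble(safe) == safe);     // true
        System.out.println(throughDouble(unsafe));           // 9007199254740992
        System.out.println(throughString(unsafe) == unsafe); // true
    }
}
```

The string path preserves every bit, which is the rationale for converting the hash to a String in the eval script and parsing it back on the Java side.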

BillFarber previously approved these changes Feb 23, 2026
@rjrudin rjrudin merged commit 2e84d2d into develop Feb 23, 2026
5 checks passed
@rjrudin rjrudin deleted the feature/optic-tweak branch February 23, 2026 16:30