Add `RoundNumberHeuristics` for change identification by 0xZaddyy · Pull Request #48 · payjoin/tx-indexer

0xZaddyy · 2026-04-06T21:01:41Z

This PR adds a round-number heuristic for change detection based on the decimal Hamming weight of output values: round-like amounts are classified as payments, while higher-weight values are treated as change.
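For readers unfamiliar with the term, here is a minimal sketch of a decimal Hamming weight, assuming it means the count of non-zero base-10 digits of the amount in satoshis (as the later commit message describes). The signature is an assumption for illustration and may differ from the function in this PR.

```rust
/// Counts the non-zero base-10 digits of an amount given in satoshis.
/// Hypothetical signature; the PR's actual function may differ.
pub fn decimal_hamming_weight(mut sats: u64) -> u32 {
    let mut weight = 0u32;
    while sats > 0 {
        // Each non-zero digit makes the amount look less "round".
        if sats % 10 != 0 {
            weight += 1;
        }
        sats /= 10;
    }
    weight
}
```

Under this definition 1.0 BTC (100,000,000 sats) has weight 1, while an amount such as 12,345,678 sats has weight 8.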

0xZaddyy · 2026-04-07T07:15:04Z

@arminsabouri one edge case we should account for is payments that look high-precision in BTC but are actually round amounts in fiat. The current implementation only detects round values denominated in BTC, so we may miss these cases.

arminsabouri · 2026-04-07T13:02:06Z

> @arminsabouri one edge case we should account for is payments that look high-precision in BTC but are actually round amounts in fiat. The current implementation only detects round values denominated in BTC, so we may miss these cases.

That's fine, let's open a ticket and resolve that later.

arminsabouri · 2026-04-07T16:15:26Z

@0xZaddyy Thanks for the last commit. Please let me know when this is ready for review (it's currently a draft).
A couple of things:

• Let's try to separate out the Hamming weight calculation into its own file with tests. It will be useful for other things in the future. This should be its own commit.
• And let's write a test or two for the change classification.

Overall this is moving in the right direction.

0xZaddyy · 2026-04-07T17:35:12Z

@arminsabouri you can have a look

arminsabouri

Had a couple questions and nits

arminsabouri · 2026-04-07T18:07:16Z

+
+        // If there's only one output, it's not change
+        if outputs.len() == 1 {
+            return TxOutChangeAnnotation::NotChange;


For future work that we should ticket out: we need a separate variant (depending on how we define change), ProbablyNotChange, or we define this as a probability. If there is one output, it could be a transfer, a consolidation, or something else (a rune, or OP_RETURN more generally). But depending on the other information, one conclusion may be more likely.

For example, perhaps the one output reuses an address from the inputs. This is likely change. However, if it's an OP_RETURN we know it can't be change. Or what if we have seen the output scriptPubKey (spk) in a different cluster belonging to a different wallet? Then again this is likely not change.

0xZaddyy · 2026-04-07T21:22:00Z

Hello @arminsabouri, many of the issues you pointed out were leftovers from my earlier cleanup of the implementation. I've now removed them.

arminsabouri · 2026-04-08T01:48:27Z

@0xZaddyy This repo follows the same commit hygiene rules as rust-payjoin. Perhaps it's time we create a contributing.md. Can we please squash some of your commits? Mainly the last one and the ones that add tests to previous commits.

0xZaddyy · 2026-04-14T08:55:16Z

fe0eff9 introduces a contextual outlier heuristic aimed at reducing overly aggressive change classification (which was causing false positives).
The idea behind the current direction is:
An output is considered a candidate only if:
• it is a clear outlier (its weight exceeds all others by a minimum gap), and
• the remaining outputs are mostly low-weight / round-like.
Then enforce a constraint:
• if exactly one output satisfies the condition → classify as Change
• otherwise → return NotChange or Inconclusive (if there is not enough information or the condition is not satisfied)
This better captures the idea that roundness is only useful when one output stands out against a round-looking background, rather than just selecting the maximum.

So cases like:
[1, 1, 6] -> strong signal
[5, 6, 7] -> weak (abstain)
[1, 1, 6, 7] -> ambiguous (abstain)
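The rule described above can be sketched roughly as follows. This is an illustration, not the code in fe0eff9: the constant values are the ones quoted from the diff later in this thread, "mostly low-weight" is assumed here to mean a strict majority of the other outputs, and `sole_candidate` is a hypothetical helper name.

```rust
const LOW_WEIGHT_THRESHOLD: u32 = 2; // value quoted from the PR diff
const MIN_OUTLIER_GAP: u32 = 2; // value quoted from the PR diff

/// True if `weights[idx]` exceeds every other weight by at least
/// MIN_OUTLIER_GAP and most of the other weights are low.
/// Assumption: "mostly" means a strict majority.
fn is_candidate(weights: &[u32], idx: usize) -> bool {
    let target = weights[idx];
    let mut max_other: u32 = 0;
    let mut low_weight_count: usize = 0;
    let mut other_count: usize = 0;
    for (i, &w) in weights.iter().enumerate() {
        if i == idx {
            continue;
        }
        other_count += 1;
        max_other = max_other.max(w);
        if w <= LOW_WEIGHT_THRESHOLD {
            low_weight_count += 1;
        }
    }
    if other_count == 0 {
        return false;
    }
    target >= max_other + MIN_OUTLIER_GAP && low_weight_count * 2 > other_count
}

/// Index of the only candidate, or None when zero or several qualify.
/// Hypothetical helper, not part of the PR.
fn sole_candidate(weights: &[u32]) -> Option<usize> {
    let mut candidates = (0..weights.len()).filter(|&i| is_candidate(weights, i));
    match (candidates.next(), candidates.next()) {
        (Some(i), None) => Some(i),
        _ => None,
    }
}
```

With these assumptions, `[1, 1, 6]` yields the third output as the sole candidate, while `[5, 6, 7]` and `[1, 1, 6, 7]` yield none, so the heuristic abstains.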

arminsabouri

Concept ACK. This is moving in the right direction. I had a couple of questions about the consts chosen, so I expect answers to those questions, not just code changes, please. Lastly, I think there is a more concrete statistical model here that may remove the need for consts. I don't have a recommendation right now; I will look into it.

arminsabouri · 2026-04-14T13:09:59Z

+
+        let weights: Vec<u32> = outputs
+            .iter()
+            .map(|out| decimal_hamming_weight(out.value().to_sat()))


In the future we may want to write a convenience method on AbstractTransaction to get an Iterator<Item = HammingWeight>. I imagine other parts of the codebase may need this.

We can introduce this when more heuristics need it.

arminsabouri · 2026-04-14T13:15:59Z

            TxOutChangeAnnotation::Change
        );
    }
+


Let's also write a test to cover the core logic of is_candidate.

arminsabouri · 2026-04-14T13:21:31Z

+    const LOW_WEIGHT_THRESHOLD: u32 = 2;
+    /// Minimum gap between a candidate's weight and the next-highest weight
+    /// for the candidate to clearly stand out as change.
+    const MIN_OUTLIER_GAP: u32 = 2;


Can you briefly explain where you got these values?

For LOW_WEIGHT_THRESHOLD, the way I think about it, it roughly captures the boundary between a human-chosen amount (payment) and a computed leftover (change):
1.0 BTC (100,000,000 sats) looks very round
0.5 is roundish
0.014 is still roundish
Amounts start looking irregular above a weight of 2 (e.g. 0.0356)
A weight <= 2 still looks plausibly human-chosen
For MIN_OUTLIER_GAP, setting it to 2 also looks fair. With a gap of 1 and weights [1, 2, 3], 3 is the highest, but that can be misleading and is not strong evidence. With a gap of 2 and weights like [1, 1, 3], 3 clearly stands out while the others look round.

Did you take a look at what other projects set these consts to (BlockSci)?
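To make the gap argument concrete, here is a small sketch that parameterizes the gap so the two settings can be compared side by side. `stands_out` is a hypothetical helper for illustration only; the PR hard-codes the gap as MIN_OUTLIER_GAP.

```rust
/// True if `weights[idx]` is at least `min_gap` above every other weight.
/// Hypothetical helper: the gap is a parameter only to compare settings.
fn stands_out(weights: &[u32], idx: usize, min_gap: u32) -> bool {
    let max_other = weights
        .iter()
        .enumerate()
        .filter(|&(i, _)| i != idx)
        .map(|(_, &w)| w)
        .max();
    match max_other {
        Some(m) => weights[idx] >= m + min_gap,
        // A lone output has nothing to stand out against.
        None => false,
    }
}
```

With a gap of 1, the last output of `[1, 2, 3]` would be accepted even though the weights rise evenly; with a gap of 2 it is rejected, while the last output of `[1, 1, 3]` is still accepted.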

implement Hamming weight roundness score

Extract decimal Hamming weight into reusable module

Move the non-zero base-10 digit count out of change_identification into its own hamming_weight module with unit tests, and import it back where it's used.

Add tests for RoundNumberHeuristics change classification

Remove obsolete round-amount checks

Clean up leftover logic from the previous implementation, including methods like `is_round_amount`, which are no longer needed with the Hamming weight-based heuristic.

Mshehu5

cACK.
Nice work!
This will be very helpful once paired with the suggested changes in #49.
The requested changes are minor and purely stylistic.
P.S. It seems this might still have conflicts.

Mshehu5 · 2026-04-20T18:20:31Z

        let outputs: Vec<_> = tx.outputs().collect();

        if outputs.len() <= 1 {
-            return TxOutChangeAnnotation::Inconclusive;
+            return TxOutChangeAnnotation::NotChange;
        }

+        debug_assert!(vout < outputs.len());
+
        let weights: Vec<u32> = outputs
            .iter()
            .map(|out| decimal_hamming_weight(out.value().to_sat()))
            .collect();


Suggested change

-        let outputs: Vec<_> = tx.outputs().collect();
-
-        if outputs.len() <= 1 {
-            return TxOutChangeAnnotation::NotChange;
-        }
-
-        debug_assert!(vout < outputs.len());
-
-        let weights: Vec<u32> = outputs
-            .iter()
-            .map(|out| decimal_hamming_weight(out.value().to_sat()))
-            .collect();
+        let weights: Vec<u32> = tx
+            .outputs()
+            .map(|out| decimal_hamming_weight(out.value().to_sat()))
+            .collect();
+
+        if weights.len() <= 1 {
+            return TxOutChangeAnnotation::NotChange;
+        }
+
+        debug_assert!(vout < weights.len());

It looks like we can drop the outputs allocation. We only use it to compute weights and check the output count, so building weights directly from tx.outputs() should be a bit leaner.

arminsabouri · 2026-04-21T13:41:30Z

+
+        let mut max_other = 0;
+        let mut low_weight_count = 0;
+        let mut other_count = 0;


Can this be rewritten as follows?

Suggested change

-        let mut other_count = 0;
+        let mut other_count = weights.len() - 1;

Then you can also move the other_count check up:

        if other_count == 0 { return false; }

arminsabouri · 2026-04-21T13:57:44Z

+    const LOW_WEIGHT_THRESHOLD: u32 = 2;
+    /// Minimum gap between a candidate's weight and the next-highest weight
+    /// for the candidate to clearly stand out as change.
+    const MIN_OUTLIER_GAP: u32 = 2;


Did you take a look at what other projects set these consts to (blocksci)?

arminsabouri · 2026-04-21T14:01:19Z

+
+        if Self::is_candidate(&weights, vout) {
+            TxOutChangeAnnotation::Change
+        } else if target_weight <= Self::LOW_WEIGHT_THRESHOLD


Shouldn't this check come first, before the is_candidate check?

arminsabouri requested changes Apr 7, 2026

Comment thread src/crates/heuristics/src/change_identification.rs Outdated

Comment thread src/crates/heuristics/src/change_identification.rs Outdated

0xZaddyy force-pushed the roundness branch 3 times, most recently from c513640 to 7a16423 Compare April 7, 2026 16:03

0xZaddyy requested a review from arminsabouri April 7, 2026 16:11

0xZaddyy marked this pull request as ready for review April 7, 2026 17:33

0xZaddyy mentioned this pull request Apr 7, 2026

fiat round payments in change detection heuristics #49

Open

arminsabouri requested changes Apr 7, 2026

arminsabouri mentioned this pull request Apr 7, 2026

ProbablyNotChange variant + probabilistic notion of change detection #50

Open

0xZaddyy changed the title ~~Add RoundNumberHeuristics for change identification~~ Add RoundNumberHeuristics for change identification Apr 7, 2026

0xZaddyy force-pushed the roundness branch from 5132552 to ad53946 Compare April 7, 2026 21:18

0xZaddyy requested a review from arminsabouri April 7, 2026 21:24

0xZaddyy force-pushed the roundness branch from ad53946 to 5154cc7 Compare April 8, 2026 02:10

bc1cindy reviewed Apr 8, 2026

Comment thread src/crates/heuristics/src/change_identification.rs

0xZaddyy force-pushed the roundness branch from 5154cc7 to c3455c9 Compare April 8, 2026 14:14

arminsabouri reviewed Apr 8, 2026

Comment thread src/crates/heuristics/src/change_identification.rs Outdated

0xZaddyy force-pushed the roundness branch from c3455c9 to 76a8cf8 Compare April 8, 2026 17:13

arminsabouri requested changes Apr 14, 2026

0xZaddyy force-pushed the roundness branch from fe0eff9 to dddedc5 Compare April 14, 2026 16:53

arminsabouri mentioned this pull request Apr 15, 2026

Implement script type matching change heuristic #55

Merged

Mshehu5 mentioned this pull request Apr 15, 2026

mark change as inconclusive using the variant from #48 #56

Closed

arminsabouri mentioned this pull request Apr 15, 2026

Return non-conclusive change in output type checks #57

Open

0xZaddyy added 2 commits April 15, 2026 19:30

add Inconclusive variant to TxOutChangeAnnotation

72cfa10

Introduce an `Inconclusive` state to represent cases where a heuristic does not have enough evidence to classify an output as change or not.

Refactor RoundNumberHeuristics to use candidate-based outlier detection

3dc5ce0

Require a sole high-weight outlier against a mostly-round background before flagging change. Return `Inconclusive` when the signal is ambiguous.

0xZaddyy force-pushed the roundness branch from f673fda to 63b0639 Compare April 15, 2026 18:39

make round-number heuristic context-aware

768689f

Introduce clearer `Change` / `NotChange` / `Inconclusive` semantics, add relative comparison for non-change detection, and remove unnecessary allocations in candidate evaluation.

0xZaddyy force-pushed the roundness branch from 63b0639 to 768689f Compare April 15, 2026 18:40

Mshehu5 suggested changes Apr 20, 2026

arminsabouri reviewed Apr 21, 2026
