diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..a2e10b4 --- /dev/null +++ b/.gitignore @@ -0,0 +1,3 @@ +*.html +*.pdf +*.css diff --git a/00_overview.md b/00_overview.md new file mode 100644 index 0000000..d208a1a --- /dev/null +++ b/00_overview.md @@ -0,0 +1,463 @@ +# TODO + +- clearly delineate the terminal nodes of the decision tree +- list any open questions, directions for future or ongoing work +- rejig motivation and background sections into front matter + +# Overview + +## Introduction + +This repository describes several related protocols and strategies that can be used in the protocols to optimize privacy, specified in adjacent files. + +This document introduces these components through a series of strawman constructions. Each one solves a different problem introduced by the previous one. This builds up to comprehensively address the broader problem of transacting privately on Bitcoin. + +## Informal Problem Definition + +Transacting privately means that the funding or spending of a specific transaction output must be plausibly attributable to a sufficiently large number of [wallet clusters](https://spiralbtc.substack.com/p/the-scroll-2-wallet-clustering-basics) under reasonable assumptions. + +Since clustering techniques are [powerful, diverse, and evolving ](https://spiralbtc.substack.com/p/the-scroll-3-a-brief-history-of-wallet), and especially because KYC information can be the basis of accurate clustering information which in general is not observable by users, a quantitative treatment is needed in order to facilitate informed decisions about the level of anonymity each user requires. This must account for the [surrounding context](https://spiralbtc.substack.com/p/the-scroll-4-intersection-attacks), the transactions that precede or succeed a transaction of interest on the graph. Considering a single transaction at a time is insufficient. 
+ +Constructing transactions that provide privacy this way inherently requires the participation of multiple parties, or there is no crowd for individuals to blend into. The purpose of these protocols is to allow the honest parties to agree on the inputs they intend to spend and the outputs they intend to create, with no arbitrary restrictions and without linking any one input or output to any other. Only honestly added outputs are allowed, ones whose cost is covered by input derived funds. + +Beyond that this protocol suite does not take a strong normative stance with respect to the specific strategy employed for optimizing privacy. Although specific recommendations are made to improve privacy, following those recommendations is a matter of incentives and client policy, not protocol rules. + +- ~0 marginal cost +- support arbitrary payments or net settlements, with counterparty privacy + - concept of mix is wasteful + +- motivation + - why not joinmarket? + - joinstr? + - payjoin? + +# BIP 77 PayJoin + +BIP 77 is a 3 round protocol between two cooperating parties, a sender and a receiver, which provides both a limited privacy improvement and an opportunity for blockspace savings. + +## Protocol Overview + +The receiver initiates the protocol by sharing a payment URI that indicates BIP 77 is supported. It is assumed that the receiver has a communication channel to the sender that provides confidentiality and integrity. + +Next, the sender replies with a fully signed transaction, sent directly to the receiver (instead of broadcasting it). + +The receiver then modifies this transaction to add its own inputs, does as it desires with the payment output, and signs its inputs. The sender's original signatures are not valid on this modified transaction, so this partially signed transaction is then sent back. + +Finally, after filling in the missing signatures the sender can broadcast the payjoin transaction, completing the protocol. 
+ +Apart from the initial URI, all protocol messages are end to end encrypted, and delivered on ephemeral mailboxes hosted on a public server, accessed using OHTTP in order to hide the peers' IP addresses. This relies on the OHTTP relays not colluding with each other or the server in order to link the metadata of the parties to the protocol to each other in the interaction. This provides no protection against a [global passive adversary](https://sites.cs.ucsb.edu/~ravenben/classes/595n-s07/papers/anon-diaz.pdf) performing traffic analysis. Such an adversary would also likely be able to link both parties' traffic to a specific Bitcoin transaction being broadcast. + +## Protocol guarantees + +For the protocol to succeed both parties must cooperate successfully. Such peers are called honest. No faults can be tolerated in the success path, because each peer is a single point of failure. Faulty peers are peers that deviate from the protocol. Being a faulty node does not necessarily imply malice. + +Byzantine behavior, which consists of arbitrary deviations from the protocol, can still be tolerated in the sense that its harms are limited. + +To start, Bitcoin consensus can ensure that transactions only proceed with unanimous consent. This means both parties' funds are safe: neither party can misappropriate funds that are under the control of the other party at the start of the protocol. This does not guarantee the receiver will be paid; the same caveats as regular on chain Bitcoin payments apply. + +The fully signed transaction provided by the sender lets the receiver opt out of the protocol and still get paid. This is important for automated receivers, which need to defend against UTXO probing, where a malicious sender would initiate payments but not follow through, in order to costlessly enumerate the UTXOs of the receiver. + +By cooperating, both parties can benefit from improved privacy and the receiver can also reduce their on chain footprint. 
In this two party setting these incentives are sufficient. + +## On Chain privacy + +To a 3rd party observer such a transaction should look like a typical unilateral payment transaction. If it unconditionally applies the common input ownership heuristic, such an observer would incorrectly conclude that the inputs of the sender and the receiver belong to the same cluster. For such transactions there will be at least 3 interpretations. In the two input, two output case, one interpretation is that everything belongs to one cluster, and there are two interpretations with two clusters, each involving one input and one output, as there are two matchings. + +Even in the best case the privacy guarantees of this structure are both weak and brittle, although it is strictly better than not using such a protocol. + +## Problem: when $n = 2$, $(n-1) = 1$ + +Since each party knows what inputs and outputs it controls, everything else can be attributed to the counterparty. This implies that both parties trust the other with clustering data for the involved coins. + +Increasing the number of parties can provide better privacy, thereby reducing the need for counterparty trust. If Alice pays Bob in a larger multiparty transaction in a way that Bob doesn't know which inputs Alice used to do this, and whether or not one of the outputs is her change output, then he no longer has privileged information about Alice's wallet cluster. + +# Multi sender, single receiver payjoin + +At the cost of one more round trip, where the senders submit signatures for their inputs, BIP 77 can be [modified](https://github.com/payjoin/rust-payjoin/pull/923) to support multiplexing of several senders' payments to one shared receiver. + +In this variation the receiver still initiates by sending URIs to multiple senders in expectation of a payment. The receiver coordinates the entire transaction, similarly to the taker role in JoinMarket. 
As such it is trusted by the senders to behave honestly. + +## Problem: only improves receiver privacy + +In this modification the receiver can benefit from improved privacy, since no sender should be able to distinguish the receiver's inputs from the other senders' inputs, but this does not extend to sender privacy as the receiver still knows which inputs and outputs are linked to which sender. + +# True multiparty payjoin in the honest threat model + +If instead of the receiver multiplexing several sessions of sender-receiver payjoin around a shared receiver the protocol was fully symmetric, then no party would be uniquely privileged or disadvantaged. + +Assuming each peer can broadcast messages to all other peers, a simple protocol with 3 main phases can be defined: + +- First each party broadcasts its TxIns. +- Then TxOuts are broadcast and collected, finalizing the transaction. +- Finally each TxIn may be signed by its owner, and the signatures are broadcast. + +The first peer to obtain a full set of signatures can broadcast the transaction on the Bitcoin peer to peer network. + +Concretely such a broadcast channel can be instantiated in a number of ways, perhaps [iroh gossip broadcast](https://docs.iroh.computer/connecting/gossip), or ephemeral [encrypted group chats](https://github.com/nostr-protocol/nips/blob/master/EE.md). There are many viable alternatives. In the honest threat model we assume all parties are trusted, not just to avoid disrupting or abusing the protocol, but also with maintaining privacy, and not retaining any information about links between inputs or outputs of the resulting transaction. + +# True multiparty payjoin in semi-honest threat model + +In the semi honest model all parties are assumed to be honest but curious. This means they can be trusted to follow the protocol rules, but don't need to be trusted with privacy. + +If the broadcast mechanism is anonymous, then linking individual inputs or outputs to each other should not be feasible. 
Concretely this can take many forms. For now, just imagine a peer to peer protocol where peers are fully connected to each other. An anonymous handshake using a shared secret known to all parties can be used for authentication of new connections. This makes it possible to post a message anonymously: a peer can connect to one of the others over Tor, for example, submit a message and disconnect. The receiving peer would then broadcast the message to all peers, and nothing will directly link this message to any other message. + +- 3 parties qualitatively different from 2, makes semi-honest threat model possible, decreased burden on honesty assumption wrt privacy but increased wrt liveness. n-1 deanonymization attacks always a contingency. + +- transport layer must protect privacy (no iroh, because no metadata privacy) + - tor + - i2p + - nym + - katzenpost? + - payjoin directory service + +- dc nets? anonymous broadcast? + - generally costlier primitives, very strong anonymity properties per message + - https://dl.acm.org/doi/pdf/10.1145/3372297.3417261 + - https://eprint.iacr.org/2022/1548.pdf + - qualitatively different privacy, does not require non colluding 3rd party servers, lack of metadata from the transport layer privacy does not affect anonymity within the set (but may be a concern on its own so something like tor or OHTTP is still desired) + +## Problem: Invalid or unfair transactions may be created + +- safety - no unfair transactions, only include honestly proposed inputs and outputs +- liveness - inputs and outputs proposed by honest peers will get included (eventually, with restrictions) +- define SMR? + +Although privacy is improved by anonymously broadcasting the pieces of the transaction, this sacrifices both safety and liveness. The semi-honest model is too trusting. 
If one of the parties claims more than their fair share in TxOut value, this may result in an invalid transaction, or theft of the intended mining fees of the other parties, or some of the honestly proposed outputs not being included. + +Since outputs are posted anonymously, even if such an attack is detected there is no mechanism to exclude the malicious peer. + + - problem: tx might not be fair + - some may add txouts whose effective cost exceeds the effective values of their txins + - if not too greedy and others agree, this is theft of transaction fees + - otherwise this is an attack on liveness + - total txout value > txins makes a tx invalid even if signed + - if others know the tx isn't fair, then they shouldn't sign + +# Multiparty payjoin with message validity + +Unfair TxOut additions can be prevented by requiring every TxOut broadcast to include a zero knowledge proof that this TxOut's effective cost (its value plus the cost of its blockspace) is covered by funds originating from one or more of the inputs, without linking the output to the origin of the funds. + +Although this can be accomplished generically using multiparty computation or zkSNARK compilers, the communication complexity or prover complexity will be higher than what a more specialized approach can achieve. More efficient sigma protocols, sufficient for instantiating a mechanism like e-cash or a privacy preserving blockchain, are possible. + +A relatively straightforward approach is to use homomorphic value commitments to represent satoshi values, similarly to WabiSabi. Each input's effective value can be distributed into such commitments. To utilize the value in a commitment, a proof with a nullifier is created. The nullifier protects against equivocation, allowing each commitment to only be consumed once. The balance proof allows the value to be aggregated, potentially with those of other commitments, and redistributed into a new set of commitments. 
Range proofs and a balance proof protect the integrity of any newly minted commitments. A balance proof with a negative delta $v$ can be used to consume committed funds in order to create a txout with effective cost $v$. + +In order to ensure the transaction does not exceed any size limits, a fixed allocation per initial input can be distributed using homomorphic value commitments, similarly to the satoshi amounts. Any protocol messages can then be required to "spend" these in order to consume block space. + * [ ] requirement to "spend" these to consume blockspace in the transaction. + +In order to break the links between inputs and outputs, the exact commitment used from the set must remain hidden. Ring signatures, or 1-of-n proofs (for example [curve forests](https://eprint.iacr.org/2024/1647.pdf)) prove such statements with respect to an explicit list of commitments. This involves proving knowledge of the opening of one (or more) of the commitments, as well as proving various relations with respect to the committed value, namely the balance proof, valid nullifier, etc. + +As an alternative to ring signatures, publicly verifiable anonymous credentials could be used to implicitly prove that the commitment is one of a set of commitments seen by all or some threshold number of the parties, with essentially the same proofs as in the ring signature approach for the other relations. Such schemes typically require pairing or general zk. + +Two equivocated transaction outputs, different outputs covered by the same funds, will both carry valid proofs, and so can be broadcast to different peers, initially causing different peers to include different outputs. However, since the proofs ensure that the nullifiers reveal such equivocation, one (so long as it's chosen deterministically) or both can be retroactively ignored by those peers. If both are struck from the transcripts of honest peers, they would be absent in the unsigned transaction. 
Since its output(s) will not be included, if the equivocating party signs with its inputs then its funds would potentially be burnt as mining fees. This compels it not to sign, which in turn allows the inputs of byzantine peers to be removed by the honest parties. + +In addition to detecting equivocations, [DAPS](https://eprint.iacr.org/2017/1203) may be employed to directly reveal (at least one of) the offending party's private keys on equivocations (those used for signing protocol messages, not the spending keys). Such a private key is an efficient proof of equivocation that peers can share in order to expel byzantine nodes. + +By restricting output additions, transaction construction can be guaranteed to be agreeable to all honest peers. This is because each party retains full control over the precise allocation of their input funds, and by definition honest peers succeed in disseminating their outputs to other honest peers, ensuring they have no reason not to sign. + +With message validity, and due to the requirement for unanimous signing, the safety property becomes trivial to satisfy. Honest peers will sign transactions with outputs derived only from provably correct and non-equivocated protocol messages. This leaves only transaction inputs as potentially malicious payloads. Any deviation from this protocol is therefore no more disruptive than omission faults. Due to the unanimity requirement the most disruptive omission fault is failing to produce a signature at the end, since that precludes any early termination by the honest peers. As the last message, signature omission is equivalent to a crash fault. + +Note that the output validity proofs can be optimistically skipped or deferred. If any remaining sats, which a peer intends to go towards mining fees, are explicitly (but anonymously) announced, then as the remaining balance (either sats or vbytes) hits 0, each peer can account for its funds in full, and acknowledge that they accept this transaction. 
If all parties acknowledge, agreement has been reached and signing can commence, with no need for any proofs. It's only if this balance becomes negative that all transaction outputs must be proven valid, ensuring any malicious attempts to over-spend can be removed and progress towards a valid transaction can be made by the honest peers. + +## Problem: byzantine peers may disrupt convergence by honest peers on an unsigned transaction + +Somewhat confusingly, in this section even though peers were not trusted to be honest with regards to message contents, they were still assumed to be honest with regards to the dissemination of those messages. Message validity merely ensures that if consensus can be reached, the agreed transaction will be fair. As yet, nothing ensures consistency (as defined by the agreement property of distributed consensus) in the presence of faulty peers. + +Dealing with omission failures in principle is conceptually simple: exclude the non-responsive parties' inputs and retry agreeing on the outputs, with the validity proofs being valid with respect to the reduced input set. However, without a byzantine fault tolerant broadcast mechanism, malicious peers may be able to cause the honest peers' view of the unsigned transaction or the signatures to diverge. In other words, a byzantine peer may cause an honest peer's valid messages to be omitted, which amounts to denial of service for that honest peer. + +# Multiparty transaction construction using trusted coordinator + +A trusted server may be used to coordinate transaction construction. Because of the unanimity requirement, the coordinator need not be trusted with custody of funds at any point. If [properly implemented](https://groups.google.com/g/bitcoindev/c/CbfbEGozG7c/m/hDx-EOJvCAAJ), a coordinator need not be trusted to maintain privacy. The coordinator is trusted with liveness in the protocol, and therefore can censor or disrupt at will, and with plausible deniability. 
+ +## Problem: "trusted" coordinator is not trustworthy + +This section justifies the design decision to avoid a centralized coordinator primarily on non-technical grounds and may be skipped with no consequence to understanding. + +Centralized CoinJoin coordination may work in the sense that a significant volume of transactions has been constructed that way. However, so far, every single centrally coordinated CoinJoin protocol has been broken in one way or another: + +- Sharedcoin + - privacy: fully trusted coordinator + - privacy: on chain privacy broken by sub-transaction model (TODO cite coinjoin sudoku, Maurer et al) +- Wasabi zerolink (RSA) + - privacy: tagging + - DoS: blind signature stockpiling (static key) +- Wasabi zerolink (blind Schnorr) + - privacy: tagging (key consistency) + - DoS: nonce reuse in first version, wagner attack in second version + - (Also DoS due to server misconfigurations) +- Whirlpool zerolink (Samourai, Ashigaru) + - privacy: xpubs + - privacy: tagging by mixid + - privacy: tagging by blind signing key + - "fixed" in ashigaru fork of the protocol, but signatures are still not validated by client + - ashigaru still ignoring https://x.com/not_nothingmuch/status/1945978442345779317 + - DoS: no domain separation on ownership proofs +- Wasabi 2 WabiSabi + - privacy: tagging + - lontivero acknowledges attack but claims it can't be fixed, nopara, david deny it exists + +Every one of these implementations requires trusting the coordinator with privacy to a significant extent, if not completely, despite claims to the contrary by vendors and proponents. +Denial of service protection was not realized, despite that being the primary purpose of the coordinator. In all cases the cryptographic aspects of the protocol amounted to little more than theater due to either broken protocol design, or inconsistencies between protocol design and client implementations. 
+ +Making things worse, none of these implementations accounts for privacy loss due to [intersection attacks](https://spiralbtc.substack.com/p/the-scroll-4-intersection-attacks), despite this privacy concern having been [described](https://arxiv.org/pdf/1708.04748) well before the existence of current centralized offerings. This is in addition to a number of other on chain deficiencies, such as careless coin selection in both regular transactions and for CoinJoin transactions. + +Malicious coordinators have been observed in the wild [exploiting weaknesses](https://github.com/orgs/WalletWasabi/discussions/13249) in the WabiSabi protocol and client implementation. Even if the still remaining flaws were addressed, it is inherently hard to protect against censorship by coordinators. An accountability mechanism for decentralized reputation might be a potential approach to address that, but would introduce significant complexity. + +Even if the tagging issues in whirlpool were addressed, the claim that coordinator fees are "anti sybil" is misleading: the exceptionally high coordination fee rate allows a malicious coordinator to subsidize the liquidity costs and mining fees required to perform $(n-1)$ deanonymization attacks with the revenue stream from coordination fees. Ironically the claim is technically accurate because the coordinator must be trusted with privacy as well, so it can deanonymize costlessly, but note that this is purely a deterrent against sybil attacks by other input owners, something that is unnecessary in the UTXO model (mining fees provide sufficient unforgeable costliness) and which does not address the colloquial notion of "sybil attack" in the context of CoinJoins, i.e. a costless deanonymization attack. + +Fees paid by consenting users, and understood to be in the service of privacy, have unfortunately not gone towards fixing these flaws. 
Instead, among other things, they were awarded to DoS attackers (through successful extortion) and funded misleading marketing efforts, which arguably includes a years long twitter feud that has alienated users and fostered a cult like mentality with regards to Bitcoin privacy technology among the remaining proponents. + +The evidence for centralized CoinJoins being flawed and predatory is overwhelming; although this is not inherent, it appears inevitable. It is not just harmful to directly exploited users, but to privacy and fungibility as a whole. Weak privacy is antithetical to censorship resistance, since the chilling effects of surveillance lead to self-censorship, and it is antithetical to self sovereign custody due to the lack of informed consent and perhaps more importantly because such misinformation exposes users to unnecessary risk with real safety concerns, as indicated by the alarming rise of $5 wrench attacks. + +This market failure is unacceptable. We therefore require a decentralized protocol in the spirit of JoinMarket, permissionless and market based, free to the extent possible of any perverse incentives or rent seeking opportunities. JoinMarket technically uses centralized coordination, since the taker coordinates any particular transaction, but since this isn't a third party, and the taker is both benefitting from and paying for the transaction, incentives are aligned. That said, we aim to improve on JoinMarket too, in terms of privacy, scalability and costs. + +# Multiparty transaction construction using BFT CRDTs + +So long as there is a deterministic procedure for ordering the inputs and outputs, the order in which they are received is immaterial: given the same set of constituents, peers will converge on the same transaction. In other words, transaction construction can be fully described in terms of a [G-set conflict free replicated data type (CRDT)](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#G-Set_(Grow-only_Set)). 
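The G-set view of transaction construction described above can be illustrated with a minimal sketch. The `TxIn`, `TxOut` and `TxState` names and the plain lexicographic ordering are assumptions made for illustration only; they are not taken from the adjacent specifications, which may define a different canonical order.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class TxIn:
    txid: str  # hex txid of the outpoint being spent
    vout: int


@dataclass(frozen=True, order=True)
class TxOut:
    script_pubkey: str  # hex scriptPubKey
    value: int          # satoshis


class TxState:
    """Grow-only sets of proposed inputs and outputs (hypothetical model)."""

    def __init__(self, inputs=(), outputs=()):
        self.inputs = frozenset(inputs)
        self.outputs = frozenset(outputs)

    def merge(self, other):
        # Set union is commutative, associative and idempotent,
        # so the order in which updates arrive does not matter.
        return TxState(self.inputs | other.inputs,
                       self.outputs | other.outputs)

    def unsigned_tx(self):
        # Assumed canonical order: plain lexicographic sort of each set.
        return tuple(sorted(self.inputs)), tuple(sorted(self.outputs))


a, b = TxIn("aa" * 32, 0), TxIn("bb" * 32, 1)
x = TxOut("0014" + "11" * 20, 50_000)
y = TxOut("0014" + "22" * 20, 30_000)

# Two peers receive the same proposals in opposite orders.
peer1 = TxState([a], [x]).merge(TxState([b], [y]))
peer2 = TxState([b], [y]).merge(TxState([a], [x]))
```

Both peers derive the same unsigned transaction from `unsigned_tx()`. Note that a plain set deduplicates byte-identical outputs, so a real design would need to decide how two identical outputs from different peers are represented.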
+ +Work by Martin Kleppmann and Heidi Howard introduced [byzantine fault tolerant (BFT) eventual consistency](https://arxiv.org/pdf/2012.00472) and [BFT CRDTs](https://martin.kleppmann.com/papers/bft-crdt-papoc22.pdf) in the asynchronous communication model (where messages can be delayed by an arbitrary amount, discussed in the next section in more detail). This result implies that so long as the honest parties are able to disseminate information, they can make progress and eventually converge even in the presence of byzantine peers. This work provides the strong eventual consistency property, which loosely means that eventually the honest peers' states will converge. Since the state is defined as a CRDT, it is guaranteed that states which have diverged can always be merged. + +Unlike equivocation of Bitcoin transactions (i.e. double spends), which are prevented by miner determined precedence, equivocations in transaction construction can simply invalidate both conflicting messages. Since this is symmetric, equivocations also do not break the CRDT semantics. Such removal can be modeled as a [two phase set](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#2P-Set_(Two-Phase_Set)). See section 2.4 of the second paper, by Kleppmann. + +Each peer processes these state updates in an arbitrary order, and executing the protocol outputs this sequence of updates for processing. The state is obtained by applying these outputs in that order, and because of the CRDT properties, if all updates are eventually delivered to all peers, their states will eventually be identical. This says nothing about their instantaneous states. + +Concretely, in Kleppmann and Howard's approach a cryptographic hash is used to construct a distributed causal log of messages, and this is used to facilitate efficient set reconciliation even in the presence of byzantine faults. We will return to set reconciliation in later sections. 
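The idea of a hash based causal log can be sketched as follows. This is a simplified illustration, not the algorithm from the cited papers: the `CausalLog` class, the JSON encoding and the naive `missing_from` comparison are all assumptions, and an efficient reconciliation would walk the graph from the heads rather than compare full id sets.

```python
import hashlib
import json


def message_id(payload, preds):
    # The id commits to the payload and to the ids of all predecessors,
    # so a peer cannot present different histories under the same id.
    encoded = json.dumps({"payload": payload, "preds": sorted(preds)},
                         sort_keys=True)
    return hashlib.sha256(encoded.encode()).hexdigest()


class CausalLog:
    def __init__(self):
        self.messages = {}  # id -> (payload, list of predecessor ids)

    def heads(self):
        # Messages that no other known message refers to.
        referenced = {p for _, preds in self.messages.values() for p in preds}
        return sorted(set(self.messages) - referenced)

    def append(self, payload):
        preds = self.heads()
        msg_id = message_id(payload, preds)
        self.messages[msg_id] = (payload, preds)
        return msg_id

    def deliver(self, msg_id, payload, preds):
        # Reject messages whose id does not match their contents, and
        # defer messages whose predecessors have not been seen yet.
        if message_id(payload, preds) != msg_id:
            return False
        if any(p not in self.messages for p in preds):
            return False
        self.messages[msg_id] = (payload, list(preds))
        return True

    def missing_from(self, other):
        # Naive reconciliation: ids the other log has that this one lacks.
        return set(other.messages) - set(self.messages)


alice, bob = CausalLog(), CausalLog()
first = alice.append("txin proposal")
second = alice.append("txout proposal")

out_of_order = bob.deliver(second, *alice.messages[second])
tampered = bob.deliver(first, "tampered payload", [])
in_order = (bob.deliver(first, *alice.messages[first])
            and bob.deliver(second, *alice.messages[second]))
```

Once both messages are delivered in causal order the two logs have the same heads, which is what lets peers compare their views cheaply.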
+ +This establishes a baseline of what can be achieved. As different approaches for improving liveness are discussed in the following sections, bear in mind that for any of those protocols strong eventual consistency is sufficient for the success path. This holds even in the presence of a dishonest majority and in the asynchronous communication model, one of the most severe settings in which honest parties might hope to cooperate. + +The catch is that "eventually" says nothing about how long this can take. Success depends on the ability of the honest peers to actually communicate with each other. For example if one of the peers is taken offline indefinitely, so long as it can eventually come back online, that is merely a transient delay and doesn't fit the definition of either a crash fault or omission fault, which technically require that either this peer is permanently offline or that some of its messages are permanently lost. Message validity does make possible somewhat stronger guarantees than the model in the paper (and similarly to the improvements of [later work](https://arxiv.org/pdf/2402.08068 +)) since transaction construction can enforce either or both of a rate limit and a total message limit, preventing malicious peers from flooding with messages to delay convergence. The terminal states (coatoms of the lattice) are those where the remaining balance is 0 or all inputs have indicated they are ready to sign, which may or may not require validity proofs. + +## Problem: BFT CRDTs only ensure strong eventual consistency + +While strong eventual consistency is a useful property, there isn't much that can be said about the intermediate states prior to convergence other than that they converge, and more importantly there isn't anything that can be said about timeliness of convergence, because message delivery may suffer unbounded delays. + +For economic transactions that require privacy, the costs incurred by unbounded delays may be prohibitive. 
In the worst case unbounded delays impose a choice between completing a transaction without privacy or waiting indefinitely. + +A robust protocol should ideally provide stronger liveness or termination guarantees, so that the honest parties will be able to make progress towards producing a valid transaction which contains all of the outputs they intend to add, and that they are able to succeed in doing so in a timely manner. This extends to network partitions: if communication between two disjoint subsets of the honest peers breaks down then either subset should be able to fall back on agreement only within the subset, even if strong eventual consistency implies that eventually communications will resume. + +# Multiparty transaction construction using leader based BFT consensus + +## State machine replication + +State machine replication (SMR) has a much stronger notion of consistency than eventual consistency. In this model all peers (typically called replicas) are required to apply updates in exactly the same order. This fully determines all intermediate states. Depending on protocol specifics some of the peers may lag behind the others. Of course this implies eventual consistency as well. + +State machine replication is often defined in terms of iterated consensus. In consensus protocols, honest peers will output the same value, and for SMR this value is used to coordinate the next update. Whereas under eventual consistency they can apply valid updates immediately after they are received, waiting for consensus requires additional coordination among the peers, so that each honest peer can rule out the possibility of disagreement. + +As we have seen, multiparty transaction construction does not depend on the updates being totally ordered for convergence. This implies that SMR is overkill for this application. 
That said, for now we will set this observation aside, and just note a problem related to the consensus problem, [generalized lattice agreement](https://dl.acm.org/doi/pdf/10.1145/2332432.2332458) ([BFT variant](https://arxiv.org/pdf/1910.05768)), which has been used to define a variant of SMR that relies on the commutativity of updates, much like eventual consistency. This model has stronger consistency guarantees than eventual consistency offers, but weaker ones than consensus. Unlike consensus, lattice agreement can still be deterministically solved in the asynchronous communication model, to which we will now turn our focus. + +## Communication models + +When communications are disrupted, relying on consensus can impede progress substantially compared to protocols that make weaker guarantees. Different kinds of disruptions are characterized using qualitatively different communication models. + +In the synchronous communication model, messages from honest peers will be delivered in a timely manner. This can be made concrete by relying on timeouts, treating unresponsive peers as faulty. This setting tolerates a dishonest majority (c.f. Dolev-Strong byzantine broadcast). While simple and practical in the centralized setting, if the coordinator is replaced with a leader based protocol and leaders may be faulty, relying on timeouts and synchronized clocks is problematic, especially in heterogeneous networks. For this reason protocols designed for the [partial synchrony](https://groups.csail.mit.edu/tds/papers/Lynch/jacm88.pdf) model are typically preferred for decentralized systems. + +Under partial synchrony and in the asynchronous model, a protocol can proceed at a rate limited by the underlying network and honest parties' ability to communicate, for example by utilizing a threshold clock. However, any kind of liveness or even termination assurance relies on the number of faulty peers being bounded. 
Depending on the communication model and on what properties are desired from consensus, this bound may be as tight as 0 as in the famous [FLP impossibility result](https://dl.acm.org/doi/pdf/10.1145/3149.214121). + +## Leader based BFT consensus + +Various byzantine fault tolerant consensus protocols have been described in both the asynchronous and partial synchrony models. Many practical protocols rely on leader based consensus, ensuring liveness for $n \geq 3f+1$ participants so long as no more than $f$ of them are faulty. Some notable examples include [honey badger](https://eprint.iacr.org/2016/199.pdf) and the [hot stuff](https://arxiv.org/abs/1803.05069) family of protocols. + +In leader based protocols, one of the peers is designated as the leader at any point in the protocol. If the leader becomes faulty, a new leader is chosen by the protocol ensuring that eventually one of the honest peers can become the leader, and progress can be made. + +## Why not both? + +In the transaction construction setting, the requirement for unanimity makes safety trivial, even a single honest peer retains veto power. This does not depend on consistency, but is enforced directly through the signature mechanism itself (barring oddities such as `SIGHASH_SINGLE` or `SIGHASH_ANYONECANPAY`). + +It follows that bolstering the consistency guarantees is best understood as a way of improving liveness over the strong eventual consistency baseline. If a complete set of signatures for the same unsigned transaction has been collected by at least one peer and broadcast on the Bitcoin peer to peer network, the protocol has concluded successfully. But if it hasn't, while waiting potentially indefinitely, the active peers may attempt to terminate the protocol earlier via a recovery path. + +BFT consensus makes such a recovery path well defined, signing can commence even if only $(n-f)$ peers are responsive and have posted their outputs successfully. 
Although this won't produce a valid transaction, since at least $f$ inputs almost certainly will not be signed, the honest parties will be able to retry transaction construction with those inputs removed. + +In the success path, on top of strong eventual consistency, consensus over the fully signed transaction provides confirmation to all of the honest peers that any one of them will be able to broadcast it on the Bitcoin network. + +## Problem: Asymmetric communication burden for leader scales poorly + +The practical limit on a single transaction size in Bitcoin is either the standardness limit, 100KvB, or the 1MvB block size limit. This is sufficient for hundreds of inputs and outputs, and by extension hundreds of participants in a single transaction. + +A multiparty transaction construction protocol focused on privacy should support scaling to hundreds of users per transaction. The bigger the crowd, the better the privacy. With a small number of users privacy is very brittle, and the privacy benefits of adding inputs quickly compound. However, eventually the returns start diminishing. When a transaction is sufficiently ambiguous, and large enough so as to be well connected to other such transactions, the marginal contribution of yet another input may be negligible. + +Transaction size limits are not a hard constraint on the scale of multiparty transactions more generally. By utilizing multisig covenant emulation and carefully managing back out paths and transaction dependencies, arbitrarily large transaction graphs can be constructed, which are semantically equivalent to a single transaction that would be too large to be valid. Such constructions are widely believed to have important implications for scaling: virtual UTXO based constructions are a special case of multiparty transaction construction where it is desirable for $n$ to be significantly larger than hundreds.
While several such protocols already exist, so far they have relied on centralized coordination and provide no privacy guarantees. + +The ultimate constraint on scale is reliability, which due to the unanimity requirement diminishes at an exponential rate. Let $p$ be the probability that any one peer experiences no faults during a run of the protocol; then the probability of none of them experiencing a fault is $p^n$. Suppose $p$ is 99.9%: with $n=100$ only a 90% probability that no peer will experience a fault remains, and it's less than 37% for $n = 1000$. + +The unanimity requirement demands that no such fault occurs in order to successfully complete a run of the protocol. This is because any fault implies needing to prune the input set, which requires agreement over the output set to be established from scratch (because any proofs would only be valid with respect to the input set before it was pruned). + +For overly large values of $n$, success might be achievable in principle but too rare to be useful in practice. Significantly below that there is a sweet spot of scale, which maximizes $n$ while maintaining an acceptable failure rate. The precise value of $n$ strongly depends on network reliability, the rate of byzantine behavior, and many other factors, but as stated above it should be at least on the order of hundreds. + +While even poorly implemented centrally coordinated coinjoins have demonstrated that $n$ on the order of hundreds is achievable, in the decentralized setting this is more costly for peers who can't rely on communicating with just one trusted coordinator. If one of the peers is selected as a leader, the additional communication overhead falls on that peer.
Coordinating such a transaction also requires sharing information about the previous outputs, the message validity proofs, etc, and the leader must broadcast all of this information to all parties, which for large transactions may be too taxing on the chosen leader, necessitating failover and incurring significant communication overhead. + +Since leader based consensus imposes asymmetric resource utilization, placing more of the burden on the leader, another useful probability to think about is that of the leader being able to communicate successfully during one run of the underlying consensus protocol. Naively the probability of consensus succeeding under this leader is $q^n$, where $q$ is the probability of the leader communicating successfully with any one peer, if the errors were independent. But network congestion is likely to cause correlated transmission failures for a leader that lacks sufficient resources, so in reality things will fare worse. Although such leader faults do not break the unanimity requirement, they delay progress of the overall protocol and consume more resources from all peers, which in turn negatively affects $p$. + +For the scales needed to make a positive impact on privacy, leader based consensus is far from optimal. + +# Multiparty transaction construction using leaderless BFT consensus + +Symmetric consensus algorithms place an equal burden on all peers (typically referred to as validators). + +The [DAG rider](https://arxiv.org/pdf/2102.08325) family of algorithms is of particular interest due to its conceptual simplicity. A causal log is used, much like in Kleppmann and Howard's approach, as if maintaining a threshold clock based CRDT for leader election, which determines the consensus state. + +- [Recent work](https://arxiv.org/pdf/2506.13998) +- gradecast? BBCA? Chitu? Orca? asymmetric trust? + +- can exploit CRDT structure, c.f.
lattice agreement above + - orderless chain related but not relevant, more concerned with programming interface and designed for continuous operations not one shot +- problem: costly communication, not friendly to mobile clients + +# Improving broadcast efficiency + +Recall the complete graph topology implied by our broadcast channel abstraction. + +In practice, not all peers will be publicly reachable, or have sufficient bandwidth. Metadata privacy and reliable communications are harder on mobile clients, requiring [creative workarounds](https://primal.net/e/nevent1qqs2470jrlr4e6ek9yxmhnkl420mt80qu3snr4fsv3tpn545cgj49eg9x35m5). + +- best effort broadcast + - erasure coding +- gossip/epidemic broadcast - most efficient (constant number of connections per peer, not linear), least resilient + - robust overlay networks to avoid fully connected topology + - efficient set reconciliation + +- stronger guarantees built on top of this: + - reliable broadcast + - atomic broadcast + - consistent broadcast + - BBCA? + +In order to decrease the communication burden, broadcast protocols that rely on erasure coding have been described in the literature. This allows the communication complexity to be reduced from cubic overall (in the number of peers) to quadratic (linear per peer). + +- gossip, set reconciliation, set union consensus + - byzantine set union consensus (set reconciliation + gradecast) https://grothoff.org/christian/consensus2016.pdf + - minisketch + - rateless set reconciliation, rateless IBLT and certainsync + - compare with bloom filter approach of Kleppmann 22 + +- brahms +- overlaybb +- https://dl.comp.nus.edu.sg/server/api/core/bitstreams/16a47ad4-bfbe-4146-bc30-916c9c691539/content + +questions: +- open gossip (e.g. ownership proofs etc) vs. per session broadcast are qualitatively different +- can peers not involved in a session still be relied upon for driving session progress, or only for coalition formation related gossip?
+ +# public infrastructure for assisting constrained clients + +- allow some clients to opt out of being validators? +- payjoin directory service as semi-trusted party, providing a broadcast channel + - efficient set reconciliation with linear communication per mobile client? + - rate limiting credentials per UTXO for rate limiting writes to directory broadcast channels +- problem: bft consensus requires n >= 3f+1. dishonest majority or even just more than f malicious parties can disrupt liveness, denying service to honest parties + - addressed to a reasonable extent by async fallback with BFT CRDTs and rate limiting total communication + - with network partitions, DAPS and robust overlay network structure are best effort to allow honest parties on either side of a partition to +- problem: privacy still brittle if only co-spending with payjoin counterparties, wallet clustering based on fingerprints + +# generalized coinjoin via open protocol enrollment followed by bft multiparty txn construction + +- TODO strawman order book model + +- more diverse counterparties can improve privacy, but comes with risk of them being malicious +- full generality says nothing about privacy guarantees + - a cost function is needed to make good decisions within the protocol +- recommended structure: + - [theoretical basis](https://github.com/nothingmuch/tx-graph-anonymity-sets/) + - radix coinjoin, with [recommended values](https://colab.research.google.com/drive/1We_FvfX_Ob9BapFW3X_By9vTtxUrt3pm) + - some similarities to wasabi 2, important differences: + - allow high hamming weight outputs for payments (for use in payjoin, or for more blockspace efficient decomposition) + - TODO describe: + - brute force search for fail closed, randomizable strategy + - efficient subset sum density estimation both for precomputation or during critical phase + +- problem: not sybil resistant, attacker can flood the order book and do (n-1) attack on honest clients + +# generalized coinjoin with
randomization mechanism + +[verifiable randomization mechanism](https://gist.github.com/nothingmuch/f5b9a559958c6116606d9da0d4d884f2) provides sybil resistance and improved graph properties, as well as useful symmetry breaking properties for the protocol in both the low volume (up to one tx per block) and high volume (more than one tx per block) regimes + +problem: incentive alignment for bootstrapping protocol unclear/intractable + +# coalition formation protocol + +- how does it relate to active participants in unknown participants setting + +[coalition formation](https://github.com/payjoin/multiparty-protocol-docs/pull/1) reduces generally hard coalition formation to simpler incremental bilateral negotiations + +aligns incentives for txn construction + +--- + + +## Background: wallet clustering & 2-party PayJoin + +PayJoin was created as a response to the wallet clustering concern. Bitcoin transaction outputs can be clustered together, labeling a group of TXOs as belonging to the same wallet. Clustering allows any blockchain external information linked to one coin in a cluster, such as personally identifying information from KYC requirements, to be associated with any other coin in the cluster. + +Clustering is a problem for Bitcoin. For businesses this can reveal information to competitors. For individuals, especially with self-custody, it is about personal safety and freedom from surveillance. And for the system as a whole it degrades fungibility and censorship resistance. + +Clustering techniques have advanced significantly over the years and continue to improve. The oldest of these is the common input ownership heuristic, also known as the multi-input heuristic, which assumes that coins that were spent as inputs to the same transaction are owned by the same entity. CoinJoin transactions are created in order to contradict this assumption, but this heuristic can be refined to filter obviously multiparty transactions.
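
As a rough illustration of the unrefined common input ownership heuristic, the following sketch merges all inputs of each transaction into one cluster using a union-find structure. The function name and the string outpoint labels are hypothetical and chosen only for this example; real clustering tools layer many further heuristics and filters on top of this.

```python
from collections import defaultdict


def cluster_by_common_input(transactions):
    """Cluster outpoints with the naive common input ownership heuristic.

    `transactions` is a list of input lists; each input is a hypothetical
    outpoint label such as "txid:vout". All inputs of one transaction are
    assumed (possibly wrongly) to share an owner.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for outpoint in inputs:
            find(outpoint)  # register single-input transactions too
        for other in inputs[1:]:
            union(inputs[0], other)

    clusters = defaultdict(set)
    for outpoint in list(parent):
        clusters[find(outpoint)].add(outpoint)
    return sorted(sorted(members) for members in clusters.values())


# Two transactions sharing the input "b:1" collapse into a single cluster.
example = [["a:0", "b:1"], ["b:1", "c:0"], ["d:0"]]
print(cluster_by_common_input(example))
```

If the first transaction were in fact a two-party PayJoin, this naive pass would incorrectly merge the two parties' coins, which is precisely the doubt that multiparty transactions are meant to introduce.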
+ +Existing PayJoin protocols (BIPs 79-77) allow two parties to collaboratively construct a transaction. Such transactions are not as easily distinguished from "regular" payment transactions (where the CIOH is accurate), but since they are multiparty transactions they cast doubt on the heuristic applied to similar transactions. + +Since PayJoin is a two party protocol, the counterparty is necessarily trusted with regards to privacy. Each party knows which inputs and outputs belong to it, and after eliminating those only the counterparty's inputs and outputs remain. + +## Motivation + +### Privacy + +A PayJoin transaction with 3 or more parties reduces the counterparty trust with regards to privacy. Suppose Alice is paying Bob, Bob is paying Carol, and Carol is paying Alice. Alice doesn't need to know which of the inputs to such a transaction belong to Bob and which to Carol. By the same logic, Bob wouldn't know which input belongs to Carol, nor if Carol is paying Alice, possibly utilizing the funds from his payment to do so. + +With regards to 3rd party observers, with improved clustering techniques the privacy of PayJoin transactions degrades. For example, by utilizing wallet fingerprint based techniques to cluster coins, a PayJoin transaction that would otherwise lead two clusters to be incorrectly collapsed into one could be filtered out if the clusters appear too distinct based on their associated fingerprints. This flags PayJoin transactions, with context clues singling them out from the background of "regular" on-chain payment transactions. Furthermore, if the outputs of such a transaction can be linked to the inputs then the payment amount can be inferred. With additional parties involved, the tasks of linking the inputs of a PayJoin transaction to each other, or the outputs to the clusters of the inputs, both become more difficult. + +### Blockspace savings + +Multiparty PayJoin provides the potential for additional blockspace savings over PayJoin.
If $n$ parties all transact with each other on chain that would require $O(n^2)$ block space, but they can instead coordinate and create a single net-settlement transaction with size $O(n)$ with the same outcome. + +RBF cut-through. cluster mempool. + +If cross input signature aggregation is enabled on Bitcoin, full aggregation would require more or less the same interaction as multiparty transactions, and incentivizes collaboration because it requires only $O(\frac{1}{n})$ witness data per participant. + +Similarly with any kind of UTXO sharing, such as payment channels or offchain vUTXOs, on chain payments can still be supported as a kind of splicing or cooperative exit operation again through interactivity. + +--- + +The receiver initiates the protocol, providing the sender with a payjoin enabled payment URI (BIPs 21, 321). The sender replies with a fully signed payment transaction, delivered over a peer to peer communication channel instead of by broadcasting to the network. The receiver at that point can opt in to replacing the transaction, adding their own inputs and signing them, and replies to the sender with the modified transaction. + +The sender's initial reply serves as a fallback transaction. This is a unilateral, fully signed transaction, as would be created by a sender without payjoin support. + +- clustering is the problem +- multi user transactions are the solution + - coinjoin w/ robust theory of anonymity sets is maxxing version of that + - payjoin not it: + - requires counterparty trust + - anonymity set size is small + - ...
+ +# CoinJoin constraints + +with `SIGHASH_ALL` (and without `SIGHASH_ANYONECANPAY`, which is a malleability issue, or other hypothetical sighash flags), all parties must sign the same transaction and then combine their signatures, or the transaction can't be broadcast on the bitcoin network + +this means that all parties must come to agreement about what transaction to sign + +if the total txout amount exceeds the total txin amount, the transaction will not be valid. therefore addition of txouts must be restricted + +if only the total txout amount is restricted, some users may include txouts exceeding the inputs they are spending, while other users would not be able to get theirs in, so restriction of txout addition must be fair and accountable, ensuring that txin funds cover txout funds per user + +for privacy (aspect of safety), txouts must not be linkable by other parties in the transaction, even in the semi-honest setting (fine in the honest setting), so txout restriction can't rely on knowing which inputs are related to which outputs + +if all of this is satisfied, i.e. the honest parties are able to come to agreement about an unsigned transaction without compromising their privacy, then no honest user should have a reason not to sign the resulting transaction. + +for liveness txouts must only be included if covered by txin funds, this ensures the unsigned txn could be valid if it were signed, and that every + + +--- + +permissionless network -> new gossip network where peers can start communicating TODO rephrase + +how do we get there?
the best we have is UTXOs but we can't assume 1 utxo = 1 peer, an adversary can make many small UTXOs, so we can't assume anything about n >= 3f + 1 + +- circular dependency between these elements needs to be broken by bootstrapping +- ability to verify UTXOs and BIP 322 ownership proofs +- ownership proofs certify listen advertisements, which include metadata + privacy preserving authenticated endpoint, suitable for establishing + pairwise channels: i2p destination, tor hidden service, directory mailbox, etc... +- peer to peer channels allow construction of an overlay network +- specifically we are interested in robust overlay networks, resistant to + dishonest majority and tolerating high churn, for example random walk + based peer sampling +- anti-entropy or set reconciliation over peer channels makes efficient gossip + possible, allowing all parties to share the set of ownership proofs and + listen advertisements diff --git a/02_coalition_formation.md b/02_coalition_formation.md new file mode 100644 index 0000000..f5e7e56 --- /dev/null +++ b/02_coalition_formation.md @@ -0,0 +1,245 @@ +# Market-based coalition formation for collaborative transaction construction + +## Problem definition + +In several existing market-based decentralized protocols involving collaborative transactions, a single party initiates and typically bears the costs, for example in JoinMarket or liquidity ads. + +Transactions meeting the demand of more than one "taker" can be more efficient, depending on their desired outcomes. CoinJoin privacy in particular is a non-rivalrous positive externality obtained by each participant from the other participants, and as such, it's inherently positive sum. + +The following protocol aims to generalize the type of order matching JoinMarket supports, allowing multiple users' intents to be aggregated together.
+ +## Overview of proposed solution + +This document describes a permissionless, peer-to-peer protocol for negotiating the initiation of a collaborative transaction construction. This takes place over two primary phases: bilateral negotiation, followed by aggregation. Successful execution of the protocol results in a coalition of UTXO owners unanimously agreeing to build a transaction together. + +This isn't a transaction construction protocol (e.g. WabiSabi), but a precursor to such a protocol. This protocol bootstraps a compatible transaction construction protocol by finding consensus on an initial set of UTXOs whose (honest) owners intend to spend in a collaborative transaction. Refer to the last section for a brief discussion of such a protocol and how it composes with this one. + +### Bilateral negotiation + +A proposer, the owner of one or more UTXOs, selects UTXOs belonging to others from an "order book". The proposer initiates peer-to-peer bilateral negotiations with the owners of these coins by sending them a message that includes a *co-spend proposal*. This expresses an intent to cooperate to spend this set of coins together in a single Bitcoin transaction. + +Co-spend proposals are publicly verifiable, confidential adjustments to the effective values of the specified coins. These adjustments can represent arbitrary payoff structures. Whether a proposal takes effect (i.e. the adjustment is applied) is contingent on the constraints specified in the proposal being satisfied in a transaction construction session intended to spend the specified UTXOs. For example, a proposal may require that non-SegWit inputs be excluded for TxID stability, or that the feerate be within an acceptable range, among other constraints. + +Co-spend proposals rely on cryptography to keep each coin's adjustment hidden, except for that coin's owner and the proposer. The proposer's UTXO is indistinguishable from the other UTXOs.
Before a proposal is unanimously accepted, it isn't known which UTXO owners have accepted the proposal. + +### Aggregation + +Co-spend proposals aren't exclusive. Many proposals can be aggregated together before collaboratively constructing a transaction in which the combination of all the aggregated proposals is in effect. The UTXOs specified in proposals that are aggregated together may overlap but don't need to; the main requirement is for the constraints to be compatible. + +Fully accepted proposals can be broadcast and then aggregated with other compatible proposals. Aggregation makes it possible for multiple participants to simultaneously optimize for their desired outcome according to their individual preferences. When they value privacy, or any other kind of positive sum interaction, aggregation generates surplus from which all participants may benefit. + +To aggregate, several proposals are bundled together with a *coalition proposal* that all parties must agree to. Unanimous acceptance of such a coalition proposal signals that everyone is ready to construct a transaction. + +Because transactions are only valid if all inputs are signed, multiparty transactions require unanimous agreement. A precondition for denial of service protection in privacy preserving multiparty transaction construction protocols is incentive compatibility for the honest participants. No honest participant should have a reason to withhold their signature. + +The primary purpose of this protocol is to construct an [imputation](https://en.wikipedia.org/wiki/Imputation_%28game_theory%29), which is a payoff distribution over the inputs. This aligns incentives for successful transaction construction while maintaining confidentiality of the individual payoffs. + +## Background + +### Hedonic games and games with transferrable utility + +Coalition formation games output a partition over $N$ peers. Finding stable equilibria in such games is generally difficult.
Depending on the structures in the preferences of the peers, many special cases described in the hedonic games literature are still hard. In the byzantine setting appropriate for Bitcoin protocols, no such structure can be assumed, which makes finding stable partitions intractable. + +Since the participants in the protocol unilaterally control independent UTXOs during the execution of this protocol, we're constrained to the non-cooperative setting, where agreements are non-binding and can't be enforced. Additionally, for privacy, the cost functions should remain hidden so that no mechanism can access the peers' preferences and globally optimize. Note that due to the nature of revealed preferences, the cost function can't be fully hidden. + +Fortunately, the introduction of a transferrable utility greatly simplifies finding stable coalition structures, even under these assumptions. Bitcoin can approximate such a transferrable utility. This makes it possible to simply assign payoffs to the coalitions and allow them to be distributed among the peers through bilateral negotiation. If Alice prefers not to co-spend with Bob, but Carol strongly favors both Alice's and Bob's participation, Carol is able to compensate Alice, so that as long as the coalition generates some surplus, all parties will be able to obtain a positive payoff. + +### Transaction construction liveness assumptions + +For a coalition of $n \leq N$ peers, $f$ of which are adversarial, liveness for transaction construction can be achieved (with privacy) by the honest subset of peers in $O(f)$ time. Any defection requires transaction construction to start over. Due to random network disruptions, necessarily some rate of (apparent) defection must be tolerated. 
+ +Because exclusivity of spending attempts by inputs can't be enforced without global consensus, and because transaction replacement is desirable for fee market efficiency, we assume inputs are allowed to participate in concurrent sessions, which will result in conflicting transactions. Opt-in signalling can be used and obeyed as a courtesy, but this ultimately reduces to the same liveness model as transaction construction, potentially requiring additional attempts to recreate the transaction. + +### Cost functions + +This protocol is mostly agnostic to the cost function that describes the peers' preferences. + +Much like in the problem of coin selection (which this work aims to generalize), the cost function is denominated in sats, and it's the combination of objective terms (i.e. the fee cost) and subjective ones (any positive utility obtained from transacting, which by definition must dominate over the objective costs for a self-interested, not necessarily honest peer to agree to sign the transaction). + +Although out of scope for this specific document, privacy-related terms are assumed to quantify things like how much cover other peers' coins provide in Maurer et al.'s sub-transaction model, or an estimate of the cost to an adversary of an $n-1$ deanonymization attack (for instance, an adversary using older coins incurs a higher cost to do such an attack, as that liquidity must be effectively locked up to become old). + +A precondition for liveness is that any such cost function is monotonically decreasing in new information revealed during transaction construction. + +Concretely, this means that whenever the action of some peer is revealed (i.e. when an input or output is added), at worst, other peers should be indifferent to this, but they may also obtain positive utility. If this isn't the case, parties may rationally choose to defect.
The purpose of the constraints in proposals is to allow the honest subset of peers to avoid such non-malicious conflicts a priori. Since monotonicity is required for liveness, a peer that engages in the protocol without adhering to this restriction and refuses to sign is deviating from the protocol, and therefore by definition not a member of the honest subset. + +## Simplified example + +Note that this example omits many details discussed below. + +Alice, Bob, Carol, and Dave are owners of UTXOs $A$, $B$, $C$, and $D$, respectively. Alice wants to CoinJoin with Bob and Carol. She creates a co-spend proposal with UTXOs $\{ A, B, C \}$. In this proposal, she adjusts her UTXO $A$ down by 100 sats, and adjusts $B$ and $C$, perhaps up by 50 sats each, providing an incentive for Bob and Carol to accept. She sends this to Bob and Carol, who accept, after which the proposal is fully accepted and broadcast. + +For all Bob knows, either Carol or Alice made the proposal, and the same goes for Carol, not knowing if it was Alice or Bob. Bob only learns that he was offered 50 sats; he doesn't learn how much Carol (or whomever) was offered. All Bob knows is that the 50 sats, which he would receive if a transaction was constructed with this proposal in effect, would be paid for by one of the UTXOs $\{A, C\}$. + +Carol also wishes to CoinJoin with Alice and Dave. She creates a co-spend proposal with UTXOs $\{ A, C, D \}$, and adjusts $C$ down by 50 sats, $A$ up by 20, and $D$ up by 30. This too is unanimously accepted and broadcast. + +Dave then notices both proposals are compatible, and chooses to aggregate them together. He produces a coalition proposal, which depends on these two proposals. The coalition proposal chooses a specific feerate, among other things, combining the constraints of both previous proposals. It names $\{ A, B, C, D \}$ as the UTXOs. + +Once all parties accept Dave's proposal, consensus has been bootstrapped.
Everyone can derive the same total adjustments: $A - 80$, $B + 50$, $C + 0$, $D + 30$. In general, only the owner of each coin knows their final adjustment, but in this example, Carol knows that Dave obtained $+30$ because her proposal is the only one that applies to his coin, and likewise Alice knows Bob obtained $+50$. + +All peers can then proceed to transaction construction, where each party is entitled to add up to their adjusted value's worth in outputs. They may add additional inputs and arbitrary outputs, so long as for each peer, the sum of the outputs' effective cost doesn't exceed the sum of the adjusted effective values of that peer's inputs. + +## Technical details + +### Setup: BIP 322 gossip, online key enrollment + +Every spendable coin could potentially be spent in a multiparty transaction (although not all coins can be co-spent). To facilitate this, we require a gossip mechanism for BIP-322 ownership proofs, with a replacement and expiration mechanism for flood protection, committing to a public key referred to as the *online key* for the coin. + +The purpose of this online key is to authenticate the owner of the UTXO without requiring them to access the spending keys for communication. Anonymous authentication requires the use of ring signatures, which are less likely to be supported by a HWW than BIP-322 proofs. Online keys also simplify the mapping to just a single public key per UTXO, regardless of the complexity the `scriptPubKey` might have (complex multisig or ambiguity from multiple spend paths). + +Ownership proofs are also used to estimate input weights. For P2TR outputs, this indicates the spend path intended to be used in any collaborative transactions. Note, however, that this isn't enforceable. + +Each proof's endorsement of the online key has a validity interval specified in terms of Bitcoin block height or MTP (valid-after and valid-until).
When multiple proofs for the same UTXO are nominally in effect, the one with the latest expiry time takes precedence. Ownership proofs commit to a block hash whose height or MTP is within some set interval of the valid-after field. + +#### Flood protection + +Online keys may need to be rotated. A rate limiting mechanism is needed for ownership proofs, and more generally for other protocol messages. + +Ownership proofs are tied to UTXOs, which provides some degree of Sybil protection, since creating UTXOs incurs a cost. However, this is insufficient for flood protection, since there is no restriction on how many ownership proofs can be created. + +Note that Sybil resistance for flood protection is a separate concern from Sybil resistance in the context of $n-1$ deanonymization attacks on CoinJoins. The latter is assumed to be encoded in the cost function, and therefore out of scope for this protocol. + +To be accepted (and propagated through gossip) by a peer, any newly made proof associated with a coin must have a hash value (e.g. wtxid of the BIP 322 `to_sign` virtual transaction) numerically smaller than all the other proofs already associated with that output which are known to the peer. + +Because ownership proofs may be valid at disjoint time intervals, a peer should store up to $k$ proofs (in total, not per validity time interval) in its gossip set for each candidate UTXO, so long as the hashes of *all* of these are numerically smaller than $c_1 + 2^{(c_2 d k)}$, where $d$ is the total duration of all proofs, and $c_i$ are dynamically set policy values (similar to `minrelayfee`, based on local resource limits). + +#### Precedence ordering + +In addition to rudimentary flood protection for the ownership proof gossip, this defines a clear precedence order.
This is so that with respect to a specific block tip, even without global consensus on the set of all ownership proofs, it's possible to efficiently arrive at the unambiguous mapping from a particular set of UTXOs to the same set of online keys by sharing with peers any relevant proofs they might be missing. + +If the online keys are generated by a hash chain (similar to LN revocation keys), revealing the key can revoke a previous ownership proof, forcing a linear sequence of key updates, and providing a mechanism for efficient revocation. However, it's unclear at the time of writing whether revocation is required at all, and the proof of work approach to flood protection seems sufficient. + +Similarly, rate limiting could be done by restricting the rate of broadcast on the basis of the committed block hash, but on its own, this wouldn't establish a precedence order, and in the absence of consensus on the set of ownership proofs, it may result in ambiguities about which online key to use. + +### Negotiation: Listen advertisements and/or gossip of partially accepted proposals + +Short-lived listen advertisements may be signed by "makers" using their online keys to indicate that the owner of a UTXO is soliciting proposals pertaining to its spending over the named communication channel. Peers should only gossip listen advertisements for currently active or soon-to-be-active online keys. The hash of the signature is used as a flood protection mechanism, similar to ownership proofs. + +"Takers" evaluate listening coins and can construct proposals (defined in the next section) over the specified communication channels. + +Alternatively, negotiation with listening but unaddressable UTXO owners can be made possible using an opt-in gossip layer. On this layer, it's permitted to broadcast partially accepted proposals. See below for a discussion of flood protection considerations in this setting. 
+ +Listen advertisements can specify constraints for proposals being entertained, indicating that proposals outside of the constrained subspace (see coalition proposal details below for specifics) are unacceptable at any price. + +Finally, listen advertisements also indicate the owner's willingness to serve as a validator node in transaction construction, as either demanding to be one, opting in at the aggregator's discretion, or declining. The aggregator has an incentive to be a validator (and to do so honestly), but is not required to. + +### Co-Spend proposals + +Co-spend proposals specify a set of outpoints of coins intended to be spent together. Fully accepted proposals take effect contingent on the conditions they specify, such as the range of feerates the offer is valid under, `nLocktime` ranges, or whether or not txid stability is required (i.e. only SegWit inputs). The effect of a proposal is some confidential redistribution of the input funds, e.g. fees negotiated between the transacting parties. + +#### Making and ratifying co-spend proposals + +A proposal specifying $n$ UTXOs can have $1 \leq m \leq n$ linkable ring signatures by the associated online keys of the named UTXOs. + +While $m < n$, a proposal is only partially accepted. Which of the peers have accepted a partial proposal is kept hidden by the ring signatures. This includes keeping the proposer anonymous within this set of signers; technically, the proposer is just the first to accept, and at least one peer needs to accept for anti-Sybil flood protection reasons. + +If $m=n$ linkable ring signatures with distinct key images have been collected, then a proposal has been unanimously accepted. + +Finally, for a proposal to be fully accepted, the $n$ linkable ring signatures are replaced with a single Schnorr signature made by the MuSig2 aggregate key formed from the set of associated online keys of the UTXOs. This reduces the size of proposals, as well as the computational cost of verifying them. 
Because all parties have accepted, no single party can deny having signed. This makes a joint multisignature semantically equivalent to the $n$ individual linkable ring signatures. The deniability of linkable ring signatures only makes a difference during negotiations before all parties have accepted. + +Fully accepted proposals can then be broadcast for aggregation. A fully accepted proposal indicates that all owners of the specified UTXOs have some interest in this proposal being in effect. + +Individual proposals commit to a specific set of BIP 322 ownership proofs so that the signatures on them are verifiable. Proposals that refer to expired ownership proofs are still considered valid, and more generally, proposals that commit to different ownership proofs are potentially valid as long as there is a non-expired ownership proof associated with the UTXO, which is also used for input weight estimations. See the coalition proposal section below for details. + +Co-spend proposals aren't updatable or revocable once fully accepted, but they do have an expiry (specified in block height, MTP, or UTC time). More than one signature on the same proposal should be considered a flooding attack by all participants, and honest parties should only sign (MuSig2) a proposal once. Again, see below for discussion of flood protection with respect to partially accepted proposals. + +However, co-spend proposals are able to explicitly include conflict hints for other proposals, indicating that (at least) the proposer of the conflicting proposal wouldn't accept a coalition proposal that depends on the excluded proposals. + +#### Effective value adjustments + +Proposals verifiably redistribute sats among the UTXOs by adjusting their effective values. Corresponding to each UTXO, a homomorphic value commitment to an adjustment term is included. This adjustment can implement fee payments between peers, and it may be positive or negative for every UTXO. 
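
The additive property that such commitments rely on can be illustrated with a toy sketch. It assumes Pedersen-style commitments, which this document does not specify, over a deliberately tiny and insecure group; the parameters `P`, `Q`, `G`, `H`, the helper names, and the adjustment values are all hypothetical and chosen only for illustration.

```python
# Toy Pedersen-style commitments (an assumption; the scheme is not specified
# in this document). The group is far too small to be secure.
P, Q = 1019, 509  # P = 2*Q + 1, both prime
G, H = 4, 9       # two elements of the order-Q subgroup of Z_P^*


def commit(value: int, blinding: int) -> int:
    """Commit to a possibly negative value under a blinding factor."""
    return (pow(G, value % Q, P) * pow(H, blinding % Q, P)) % P


def add_commitments(commitments):
    """Homomorphic addition: multiplying commitments adds the committed values."""
    total = 1
    for c in commitments:
        total = (total * c) % P
    return total


# Hypothetical per-UTXO adjustments in sats; each may be positive or negative.
adjustments = [-80, 50, 0, 30]
blindings = [101, 202, 303, 404]
commitments = [commit(v, r) for v, r in zip(adjustments, blindings)]

# The combined commitment opens to the sum of the values and of the blindings.
assert add_commitments(commitments) == commit(sum(adjustments), sum(blindings))
```

Negative adjustments are represented modulo the group order `Q`, so a payment from one UTXO to another cancels out in the combined commitment.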
+ +The sum of all the adjustment value commitments in a proposal is a commitment to the sum of the adjustments. This value, the *surplus* of a proposal, is included in cleartext, and the sum commitment is proven to commit to it in a balance proof. This too may be positive or negative. + +When communicating a partial proposal, the proposer shares with each recipient the opening of the commitment associated with their UTXO. This payoff is denominated in sats, and it can be positive or negative. + +Each adjustment value commitment is also covered by a range proof certifying that the adjustment is a small positive or negative integer. Small means at least $\lfloor\frac{v}{\log_2(v)}\rfloor$, where $v$ is the minimal effective value among all specified UTXOs. This minimum value is computed by taking the highest acceptable feerate in the proposal conditions and multiplying that by the input weights estimated from the ownership proofs. + +The sum of the adjustment commitments, when tweaked by the effective value of the UTXO according to the final feerate of the UTXO, must be a commitment to a positive number of sats. This is ensured by accounting for the maximum values in the range proofs, and by ensuring it doesn't overflow the effective value. + +Imposing a minimal range width places a limit on how many proposals can be aggregated together, roughly logarithmic in the total value to be spent. Proposals can use wider range proofs than the minimum, which limits their ability to be aggregated and leaks information about the size of the adjustments. + +### Proposal aggregation and condition details + +A coalition proposal is a special kind of co-spend proposal that aggregates together a bundle of non-conflicting, fully accepted co-spend proposals. The coalition proposal is made by one of the peers, the aggregator, to *all* other peers implicated in the aggregation. The aggregator must also set specific values for any constrained parameters (e.g. 
a concrete feerate and not just a range). + +#### Making and ratifying coalition proposals + +Any peer may attempt to construct a coalition proposal by aggregating unanimously accepted co-spend proposals together, so long as it controls at least one of the online keys implicated in the aggregation. For rate limiting, the aggregator's key is made explicit, and the hash of the aggregator's signature is used for flood control when gossipping the partially signed coalition proposal, much like the hash of ownership proofs. + +Unlike co-spend proposals, coalition proposals are accepted with a regular signature by the online key. This makes it tractable for aggregators to make coalition proposals that revise the payoff or omit peers who didn't accept. As discussed above, coalition proposals aren't mutually exclusive. If a peer rejects a coalition proposal due to the inclusion of a specific proposal, it may broadcast a counter final proposal of its own, so there's no mechanism for explicit rejection. + +A coalition proposal also specifies a concrete transaction construction protocol version, and commits to a specific set of listen advertisements associated with the specified UTXOs, which have either demanded or opted into serving as validators. These peers agree to allow other peers to connect to them and to facilitate gossip. Depending on the liveness requirements of byzantine agreement for the subsequent transaction construction protocol, the aggregator may specify validators at their discretion and named parties may accept or decline. + +#### Bootstrapping consensus for transaction construction + +Once a coalition proposal is fully accepted, a coalition has been formed, complete with a payoff imputation. The online keys are then used to initiate a consensus protocol to construct the full transaction. + +Since the coalition proposal addresses every UTXO, its ownership proof set commitment is authoritative. 
This determines the canonical online key set for the coalition, allowing a consensus protocol to be initialized. The aggregator should use precedence ordering for symmetry breaking to minimize additional gossip and improve the chances of the coalition proposal being unanimously accepted, but there's no consensus state for this data at the time the coalition proposal is being authored. The purpose of coalition proposals is to bootstrap both the consensus protocol and incentive compatibility to make use of it. + +#### Coalition proposal structure + +The surplus of all included co-spend proposals is added to the coalition proposal's adjustment vector, which covers all UTXOs. The coalition proposal's surplus must be positive and needs to cover the shared transaction fields at the set feerate, but otherwise, the aggregated surplus is at the discretion of the aggregator; it may be claimed as a coordination fee, redistributed among all parties arbitrarily, or included in the mining fees. + +The logical conjunction of co-spend proposals implies the union of their adjusted UTXOs and the intersection of their conditions is in effect. The per-UTXO adjustments are collected for each outpoint. The total adjustment for each UTXO may not overflow its effective value, based on the range proof widths. + +Co-spend proposals therefore form a commutative semi-group, unioning the sets of UTXOs and summing the adjustments component wise. It's only a semi-group because proposals can't always be aggregated together. One reason for this is that the conditions might be mutually exclusive. Another is consensus rules (e.g. `OP_CLTV` making conflicting assertions). + +Conditions are intersected elementwise. The intersection of mutually exclusive conditions — for example, a time-based vs. a height-based nLocktime — is empty, precluding the combination of such proposals. 
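
The aggregation rule described above can be sketched as a partial binary operation. This is a minimal illustration, not the protocol's data model: it uses cleartext adjustments instead of commitments, models every condition as an inclusive integer interval, and treats a condition missing from one proposal as unconstrained. The `Proposal` type, the `combine` and `intersect` helpers, and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]  # inclusive (low, high); an illustrative assumption


@dataclass
class Proposal:
    adjustments: Dict[str, int]      # outpoint label -> adjustment in sats
    conditions: Dict[str, Interval]  # condition name -> acceptable interval


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Intersect two inclusive intervals; None means they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


def combine(p: Proposal, q: Proposal) -> Optional[Proposal]:
    """Union the UTXO sets, sum adjustments, and intersect conditions.

    Returns None when some condition becomes empty, which is why the
    operation is only partial.
    """
    conditions: Dict[str, Interval] = {}
    for name in p.conditions.keys() | q.conditions.keys():
        if name in p.conditions and name in q.conditions:
            merged = intersect(p.conditions[name], q.conditions[name])
            if merged is None:
                return None
            conditions[name] = merged
        else:
            # Assumption: a condition absent on one side is unconstrained there.
            conditions[name] = p.conditions.get(name, q.conditions.get(name))
    adjustments = dict(p.adjustments)
    for outpoint, delta in q.adjustments.items():
        adjustments[outpoint] = adjustments.get(outpoint, 0) + delta
    return Proposal(adjustments, conditions)


first = Proposal({"A": -50, "B": 50}, {"feerate": (1, 20)})
second = Proposal({"A": -30, "C": 0, "D": 30}, {"feerate": (5, 50)})
incompatible = Proposal({"E": 10}, {"feerate": (60, 100)})

merged = combine(first, second)
```

Here `merged` carries the summed adjustment for `A` and the narrowed feerate interval, `combine(second, first)` yields an equal result, and `combine(first, incompatible)` returns `None` because the feerate intervals don't overlap.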
+ +The following are all numerical intervals under intersection: + +- Transaction consensus fields + - version + - individual flag bits (top is {0,1} then {0}, {1}, bottom is {}) + - nIns + - not included: nOuts (min cannot be enforced and so can't be included, max is enforced indirectly by vbytes allocation, see below) + - nSequence (constrains all nSequence fields) + - nLocktime +- Non-consensus parameters + - feerate + - max total vbytes per tx (divide by nIns and round down for allocation per input, nIns must be set accordingly to pass standardness or consensus limits) + +The following are sets under intersection: + +- Allowed input types (catchall "unknown" or "arbitrary" as one of the variants, variants for standard or well known output types) +- Allowed output types + +### Flood protection for partially accepted proposals + +Partially accepted proposals can be communicated directly only between involved peers or gossipped to third parties. Gossip of partial proposals enables a single proposer to make many nuisance offers involving different sets of keys. Normally linkable ring signatures will protect their identity, which means that other nodes can only use a statistical approach to detect the culprit and may mistakenly penalize the UTXOs the attacker names, without their owners contributing to this flooding attack. "Normally" means the key image generator point is just a hash to curve of the proposal data itself, allowing a practically unlimited number of ring signatures per UTXO (exponential in the size of the candidate UTXO set). + +To limit this behavior, the linkable ring signature key image base point can be time dependent (e.g. making use of a sufficiently buried block's hash as a randomness beacon) and include a truncated hash of the proposal when choosing the generator (by hashing to the curve). 
Truncating to 10 bits, for example, would allow up to 1,024 unrelated proposals to be accepted or made by a single UTXO equivocation per timestep, allowing flooding to be prevented. Note, however, that honest users may be restricted from accepting some honest proposals due to the birthday bound, but honest proposers with knowledge of other proposals can grind the proposal data to avoid collisions, and we of course assume attackers can do this without restriction. + +Additionally, it's possible to make use of a verifiable encryption scheme for the openings of the commitments, allowing verification of the proposal during gossip, but this doesn't prevent the broadcast of nuisance proposals that wouldn't be accepted despite being formally valid. + +In contrast, in a semi-honest setting, and if acceptors don't mind revealing whether they accept to the proposer, the linkable ring signatures can be shortcut entirely and MuSig2 negotiated directly, with the proposer as the coordinator between all named parties. To minimize the number of round trips, MuSig2 signing should still be initiated concurrently, even when linkable ring signatures are used. + +### PayJoin-like payments in multiparty transactions + +Payment requests or intents may be encoded in the co-spend proposals using the value adjustment mechanism. Even if fully accepted, such proposals are merely signalling the intention to pay, because finality is contingent on successful collaborative transaction construction. + +The sender and receiver must arrange for proposals to transfer the payment amount. This can be achieved by the receiver extending the sender's proposal with additional UTXOs (potentially of unrelated parties, although at least one UTXO should be the receiver's in order to claim the adjustment, and multiple proposals may be required to represent the full payment amount due to the range proof width restriction). 
The sender and receiver must collaborate to produce a balance proof, as only one or the other knows the openings of some of the homomorphic value adjustments. + +With this approach, fallback transactions as in BIPs 79-7 are no longer as prudent, since the receiver's UTXOs are much more difficult to probe. Neither sender nor receiver needs to disclose any UTXOs to each other; only the payment amount is fully known to both. + +Although out of scope for this document, peers also negotiate the payment details out of band and arrange for valid registration of the receiver outputs during transaction construction. In this setting, the receiver doesn't need to contribute a UTXO at all and may still get privacy from the sender, since their output(s) may be ambiguous, and there are no limitations arising from the use of width limited range proofs. + +### Post quantum security and privacy + +Apart from the BIP-322 ownership proofs, which are related to spending keys, the signatures used throughout don't require PQ security for privacy (assuming perfectly hiding commitments) or safety, but only for liveness. + +However, for privacy, it's important that any proofs of knowledge of openings of the commitments don't compromise the unconditional hiding of commitments, so as to avoid "harvest now, decrypt later" attacks on anonymity (although out of the scope of this document, this is also required for transaction construction). + +Secondly, in addition to the secp256k1 online key, a PQ key for end-to-end encrypted communication (e.g. the encrypted openings of commitments included in a proposal) can also be used. HPKE is an obvious choice for secp256k1-based encryption, and hybrid HPKE is specified using ML-KEM, which makes it a relatively simpler change. However, neither of these is verifiable without general purpose ZK, which is often costly for the prover or in proof size. 
Lattice-based schemes (without a symmetric cipher) seem potentially amenable to such verifiable encryption, but I have yet to look into that. Verifiable encryption is necessary for partial proposal gossip, as otherwise there's no guarantee that the proposal can even be accepted, as some parties may not learn the opening of the value commitments associated with their UTXOs. + +Finally, PQ privacy must of course also assume that metadata privacy at the transport layer is preserved. + +### Transaction construction protocol + +A protocol similar to WabiSabi can provide liveness, although in this setting, WabiSabi's reliance on KVACs would require one of the peers to serve as the coordinator, and it would be trusted. With modification, this coordinator could prove all transcript messages it accepted are valid, but may still arbitrarily censor certain operations, causing a UTXO to be unfairly blamed for failing to sign. + +Technically, the coalition proposal can implement a coordination fee mechanism, for example by making all adjustments slightly negative, apart from that of the coordinator (which may or may not be the aggregator). This provides the coordinator with an incentive to coordinate honestly on average. Unfortunately, this does little for any targeted censorship risk, so long as the payoff from censorship (of e.g. a specific TxOut) dominates the expected loss of revenue from sabotaging a single attempt at transaction construction. It's likely that the majority of the revenue is still recoverable in a subsequent blame round, as the other participants have no indication that censorship has occurred. + +For a more robust mechanism, coalition formation outputs a set of online keys for a coalition, and among other things, these are intended to be used to implement byzantine fault tolerant state machine replication for transaction construction. 
The initial state, based on the coalition proposal, can be proven valid with respect to each UTXO's adjustment sum commitments, which enforces that the agreed to imputation is in effect. + +The state machine collects inputs and outputs of a transaction (or tree of transactions a la CoinJoinXT). Since state messages are self authenticating, authenticating using zero knowledge proofs (anonymous credentials or ring signatures), they can be posted anonymously, much like in a permissioned blockchain, where all peers act as validators. diff --git a/chatgpt-github.css b/chatgpt-github.css new file mode 100644 index 0000000..5661d75 --- /dev/null +++ b/chatgpt-github.css @@ -0,0 +1,64 @@ +/* Base */ +body { + font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", + Helvetica, Arial, sans-serif, "Apple Color Emoji", + "Segoe UI Emoji"; + font-size: 16px; + line-height: 1.6; + color: #24292f; + background-color: #ffffff; + max-width: 980px; + margin: 0 auto; + padding: 32px; +} + +/* Headings */ +h1, h2, h3, h4, h5, h6 { + margin-top: 24px; + margin-bottom: 16px; + font-weight: 600; + line-height: 1.25; +} + +h1 { + font-size: 2em; + padding-bottom: 0.3em; + border-bottom: 1px solid #d0d7de; +} + +h2 { + font-size: 1.5em; + padding-bottom: 0.3em; + border-bottom: 1px solid #d0d7de; +} + +h3 { font-size: 1.25em; } +h4 { font-size: 1em; } +h5 { font-size: 0.875em; } +h6 { font-size: 0.85em; color: #57606a; } + +/* Paragraphs */ +p { + margin-top: 0; + margin-bottom: 16px; +} + +/* Links */ +a { + color: #0969da; + text-decoration: none; +} + +a:hover { + text-decoration: underline; +} + +/* Lists */ +ul, ol { + margin-top: 0; + margin-bottom: 16px; + padding-left: 2em; +} + +li +