From 977342a9434cd34a6fe24b097b49fbb8dae659d0 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Sun, 29 Mar 2026 14:32:59 +1030 Subject: [PATCH 1/3] Varops: Two BIPs for Script Restoration: varops calculations and tapleaf version (0xc2). Special thanks to Murch for teaching me mediawiki, and so much great formatting and clarity advice. Signed-off-by: Rusty Russell --- .typos.toml | 1 + bip-unknown-script-restoration.mediawiki | 710 +++++++++++++++++++++++ bip-unknown-varops-budget.mediawiki | 353 +++++++++++ 3 files changed, 1064 insertions(+) create mode 100644 bip-unknown-script-restoration.mediawiki create mode 100644 bip-unknown-varops-budget.mediawiki diff --git a/.typos.toml b/.typos.toml index e30e9e6bd4..79fb1312af 100644 --- a/.typos.toml +++ b/.typos.toml @@ -29,6 +29,7 @@ Atack = "Atack" Falke = "Falke" Meni = "Meni" Ono = "Ono" +Toom = "Toom" [files] extend-exclude = [ diff --git a/bip-unknown-script-restoration.mediawiki b/bip-unknown-script-restoration.mediawiki new file mode 100644 index 0000000000..d904678819 --- /dev/null +++ b/bip-unknown-script-restoration.mediawiki @@ -0,0 +1,710 @@ +
+  BIP: ?
+  Layer: Consensus (soft fork)
+  Title: Restoration of disabled script (Tapleaf 0xC2)
+  Authors: Rusty Russell 
+           Julian Moik 
+  Status: Draft
+  Type: Specification
+  Assigned: ?
+  License: BSD-3-Clause
+  Discussion: https://groups.google.com/g/bitcoindev/c/GisTcPb8Jco/m/8znWcWwKAQAJ
+  Version: 0.1.0
+  Requires: Varops BIP
+
+ +==Introduction== + +===Abstract=== + +This BIP introduces a new tapleaf version (0xc2) which restores Bitcoin script +to its pre-0.3.1 capability, relying on the Varops Budget in +[[bip-unknown-varops-budget.mediawiki|BIP-varops]] to prevent the excessive +computational time which caused CVE-2010-5137. + +In particular, this BIP: +* Reenables disabled opcodes. +* Increases the maximum stack object size from 520 bytes to 4,000,000 bytes. +* Introduces a total stack byte limit of 8,000,000 bytes. +* Increases the maximum total number of stack objects from 1,000 to 32,768. +* Removes the 32-bit size restriction on numerical values. +* Treats all numerical values as unsigned. + +All opcodes are described in exact (painstaking) byte-by-byte operations, so +that their varops budget can be easily derived. Note that this level of +detail is unnecessary to users of script, only being of interest to +implementers. + +===Copyright=== + +This document is licensed under the 3-clause BSD license. + +===Motivation=== + +Since Bitcoin v0.3.1 (addressing CVE-2010-5137), Bitcoin's scripting +capabilities have been significantly restricted to mitigate known +vulnerabilities related to excessive computational time and memory usage. +These early safeguards were necessary to prevent denial-of-service attacks and +ensure the stability and reliability of the Bitcoin network. + +Unfortunately, these restrictions removed much of the ability for users to +control the exact spending conditions of their outputs, which has frustrated +the long-held ideal of programmable money without third-party trust. + +==Execution of Tapscript 0xC2== + +If a taproot leaf has a version of 0xc2, execution of opcodes is as defined +below. All opcodes not explicitly defined here are treated exactly as defined +by [[bip-0342.mediawiki|BIP342]]. + +Validation of a script fails if: +* It exceeds the remaining varops budget for the transaction. +* Any stack element exceeds 4,000,000 bytes. 
+* The total size of all stack (and altstack) elements exceeds 8,000,000 bytes. +* The number of stack elements (including altstack elements) exceeds 32,768. + +===Rationale=== + +There needs to be some limit on memory usage, to avoid a memory-based denial +of service. + +Putting the entire transaction on the stack is a foreseeable use case, hence +using the block size (4MB) as a limit makes sense. However, allowing 4MB +stack elements is a significant increase in memory requirements, so a total +limit of twice that many bytes (8MB) is introduced. Many stack operations +require making at least one copy, so this allows such use. + +Putting all outputs or inputs from the transaction on the stack as separate +elements requires as much stack capacity as there are inputs or outputs. The +smallest possible input is 41 bytes (allowing almost 24,390 inputs), and the +smallest possible output is 9 bytes (allowing almost 111,111 outputs). +However, empty outputs are rare and not economically interesting. Thus we +consider the smallest non-OP_RETURN standard output script, which is P2WPKH at 22 +bytes, giving a minimum output size of 31 bytes, allowing 32,258 outputs in a +maximally-sized transaction. + +This makes 32,768 a reasonable upper limit for stack elements. + +===SUCCESS Opcodes=== + +The following opcodes are renamed OP_SUCCESSx, and cause validation to +immediately succeed: + +* OP_1NEGATE = OP_SUCCESS79 +* OP_NEGATE = OP_SUCCESS143 +* OP_ABS = OP_SUCCESS144<ref>Anthony Towns suggested this could become an + opcode which normalized the value on the top of the stack by truncating any + trailing zeroes.</ref> + +====Rationale==== + +Negative numbers are not natively supported in 0xC2 Tapscript. Arbitrary +precision makes them difficult to manipulate and negative values are not used +meaningfully in Bitcoin transactions. 
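The element-count figures quoted in the Rationale for the stack limits can be re-derived with integer arithmetic. The sketch below is illustrative only: it assumes, as the quoted figures imply, that a maximally-sized transaction holds about 1,000,000 non-witness bytes (4,000,000 weight units divided by 4) and it ignores fixed transaction overhead. All constant and function names are hypothetical.

```python
# Assumption: a maximally-sized transaction is treated as 1,000,000
# non-witness bytes (4,000,000 weight units / 4 weight units per byte),
# ignoring the few bytes of fixed transaction overhead.
MAX_NONWITNESS_BYTES = 4_000_000 // 4

MIN_INPUT_BYTES = 41                               # smallest possible input
MIN_OUTPUT_BYTES = 9                               # 8-byte amount + 1-byte script length
P2WPKH_SCRIPT_BYTES = 22                           # smallest standard non-OP_RETURN script
P2WPKH_OUTPUT_BYTES = 8 + 1 + P2WPKH_SCRIPT_BYTES  # amount + length + script

STACK_ELEMENT_LIMIT = 32_768                       # limit proposed in this BIP


def max_items(item_bytes: int) -> int:
    """How many items of the given size fit in a maximally-sized transaction."""
    return MAX_NONWITNESS_BYTES // item_bytes


if __name__ == "__main__":
    print("inputs:        ", max_items(MIN_INPUT_BYTES))
    print("empty outputs: ", max_items(MIN_OUTPUT_BYTES))
    print("P2WPKH outputs:", max_items(P2WPKH_OUTPUT_BYTES))
    print("fits in limit: ", max_items(P2WPKH_OUTPUT_BYTES) <= STACK_ELEMENT_LIMIT)
```

Under these assumptions the P2WPKH case is the binding one, and the chosen limit is the next power of two above it.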
+ +===Arbitrary-length Values, Endianness, and Normalization of Results=== + +The restoration of bit operations means that the little-endianness of stack +values is once more exposed to the Script author, if they mix them with +arithmetic operations. The restoration of arbitrary-length values +additionally exposes the endianness to the implementation authors (who cannot +simply load stack entries into registers), and requires explicit attention +when calculating the varops costs of operations.<ref>For example, removing +trailing bytes from a stack element is almost free, whereas removing bytes +from the front involves copying all remaining bytes.</ref> + +Note that only arithmetic operations (those which treat operands as numbers) +normalize their results: bit and byte operations do not.<ref>Such +non-arithmetic operations can be used to operate on values such as preimages +or (with introspection) parts of transactions, where truncation of zeros would +be unexpected. One could argue that even arithmetic operators should not +normalize, but that would be a gratuitous and surprising change. Note that "0 +OP_ADD" can always be used to cheaply normalize the top stack element.</ref> 
+Thus operations such as "0 OP_ADD" and "2 OP_MUL" will never result in a top +stack entry with a trailing zero byte, but "0 OP_OR" and "1 OP_UPSHIFT" +may.<ref>The original Bitcoin implementation had a similar operational split, +but OP_LSHIFT and OP_RSHIFT did normalize, which was almost a requirement +given that they also preserved the sign of the shifted operand.</ref> + +To be explicit, the following operations are defined as arithmetic and will +normalize their results: + +* OP_1ADD +* OP_1SUB +* OP_2MUL +* OP_2DIV +* OP_ADD +* OP_SUB +* OP_MUL +* OP_DIV +* OP_MOD +* OP_MIN +* OP_MAX + +===Non-Arithmetic Opcodes Dealing With Stack Numbers=== + +The following opcodes are redefined in 0xC2 Tapscript to read numbers from the +stack as arbitrary-length little-endian values (instead of CScriptNum): + +# OP_CHECKLOCKTIMEVERIFY +# OP_CHECKSEQUENCEVERIFY +# OP_VERIFY +# OP_PICK +# OP_ROLL +# OP_IFDUP +# OP_CHECKSIGADD + +These opcodes are redefined in 0xC2 Tapscript to write numbers to the stack as +minimal-length little-endian values (instead of CScriptNum): + +# OP_CHECKSIGADD +# OP_DEPTH +# OP_SIZE + +In addition, the [[bip-0342.mediawiki#specification|BIP-342 success +requirement]] is modified to require a non-zero variable-length unsigned +integer value (not CastToBool()): + +Previously: + +``4. (ii) If the execution results in anything but exactly one element on the +stack which evaluates to true with CastToBool(), fail.`` + +Now: + +``4. (ii) If the execution results in anything but exactly one element on the +stack which contains one or more non-zero bytes, fail.`` + +===Enabled Opcodes=== + +Fifteen opcodes that were removed in v0.3.1 are re-enabled in 0xC2 Tapscript. + +If there are fewer than the required number of stack elements, these opcodes +fail validation. Operands are popped off the stack in right-to-left order, +i.e. [A B] means pop B off the stack, then pop A off the +stack. 
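The number handling described above (unsigned little-endian values, normalization by dropping trailing zero bytes, and the "one or more non-zero bytes" success rule) can be sketched as follows. This is a minimal illustration, not the reference implementation; the function names are hypothetical.

```python
def normalize(value: bytes) -> bytes:
    """Drop trailing zero bytes, which are the most-significant bytes of a
    little-endian value. Arithmetic opcodes return results in this form."""
    return value.rstrip(b"\x00")


def to_int(value: bytes) -> int:
    """Read a stack element as an arbitrary-length unsigned little-endian number."""
    return int.from_bytes(value, "little")


def from_int(number: int) -> bytes:
    """Write a number as a minimal-length little-endian value (zero is empty)."""
    return number.to_bytes((number.bit_length() + 7) // 8, "little")


def is_true(value: bytes) -> bool:
    """Modified success rule: true if the element has any non-zero byte.
    Unlike the legacy CastToBool(), there is no special case for negative
    zero, so b"\\x80" counts as true here."""
    return any(value)


if __name__ == "__main__":
    print(normalize(b"\x01\x00\x00"))  # value 1 with two redundant bytes
    print(to_int(b"\x00\x01"))         # second byte is the more significant one
    print(from_int(0), from_int(256))
    print(is_true(b"\x00\x00"), is_true(b"\x80"))
```

Bit and byte opcodes would skip the `normalize` step, which is why "0 OP_OR" can leave trailing zero bytes while "0 OP_ADD" cannot.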
+ +See [[bip-unknown-varops-budget.mediawiki|BIP-varops]] for the meaning of the +annotations in the varops cost field. + +====Splice Opcodes==== + +{| +! Mnemonic +! Opcode +! Input Stack +! Description +! Definition +! Varops Cost +! Varops Reason +|- +|OP_CAT +|126 +|[A B] +|Append B to A +| +# Pop operands off the stack. +# Append B to A. +# Push A onto the stack. +|(length(A) + length(B)) * 3 +|COPYING +|- +|OP_SUBSTR +|127 +|[A BEGIN LEN] +|Extract bytes BEGIN through BEGIN+LEN of A +| +# Pop operands off the stack. +# Remove BEGIN bytes from the front of A (all bytes if BEGIN is greater than length of A). +# If length(A) is greater than value(LEN), truncate A to length value(LEN). +# Push A onto the stack. +|(length(LEN) + length(BEGIN)) * 2 + MIN(Value of LEN, MAX(length(A) - Value of BEGIN, 0)) * 3 +|LENGTHCONV + COPYING +|- +|OP_LEFT +|128 +|[A OFFSET] +|Extract the left OFFSET bytes of A +| +# Pop operands off the stack. +# If length(A) is greater than value(OFFSET), truncate A to length value(OFFSET). +# Push A onto the stack. +|length(OFFSET) * 2 +|LENGTHCONV +|- +|OP_RIGHT +|129 +|[A OFFSET] +|Extract the right bytes of A, from OFFSET onwards +| +# Pop operands off the stack. +# If value(OFFSET) is less than length(A), copy value(OFFSET) bytes from offset value(OFFSET) to offset 0 in A, and truncate A to length(A) - value(OFFSET). Otherwise truncate A to length 0. +# Push A onto the stack. +|length(OFFSET) * 2 + value of OFFSET * 3 +|LENGTHCONV + COPYING +|} + +=====Rationale===== + +OP_CAT may require a reallocation of A (hence, COPYING A) before appending B. + +OP_SUBSTR may have to copy LEN bytes, but also needs to read its two numeric +operands. LEN is limited to the length of the operand minus BEGIN. + +OP_LEFT only needs to read its OFFSET operand (truncation is free), whereas +OP_RIGHT must copy the bytes, which depends on the OFFSET value. + +====Bit Operation Opcodes==== + +{| +! Mnemonic +! Opcode +! Input Stack +! Description +! 
Definition +! Varops Cost +! Varops Reason +|- +|OP_INVERT +|131 +|[A] +|Bitwise invert A +| +# Pop operands off the stack. +# For each byte in A, replace it with that byte bitwise XOR 0xFF (i.e. invert the bits) +# Push A onto the stack. +|length(A) * 4 +|OTHER +|- +|OP_AND +|132 +|[A B] +|Binary AND of A and B +| +# Pop operands off the stack. +# If B is longer than A, swap B and A. +# For each byte in A (the longer operand): bitwise AND it with the equivalent byte in B (or 0 if past end of B) +# Push A onto the stack. +|(length(A) + length(B)) * 2 +|OTHER + ZEROING +|- +|OP_OR +|133 +|[A B] +|Binary OR of A and B +| +# Pop operands off the stack. +# If B is longer than A, swap B and A. +# For each byte in B (the shorter operand): bitwise OR it into the equivalent byte in A (altering A). +# Push A onto the stack. +|MIN(length(A), length(B)) * 4 +|OTHER +|- +|OP_XOR +|134 +|[A B] +|Binary exclusive-OR of A and B +| +# Pop operands off the stack. +# If B is longer than A, swap B and A. +# For each byte in B (the shorter operand): exclusive OR it into the equivalent byte in A (altering A). +# Push A onto the stack. +|MIN(length(A), length(B)) * 4 +|OTHER +|} + +=====Rationale===== + +OP_AND, OP_OR and OP_XOR are assumed to fold the results into the longer of +the two operands. This is an OTHER operation (i.e. cost is 4 per byte), but +OP_AND needs to do this until one operand is exhausted, and then zero the rest +(ZEROING, cost 2 per byte). OP_OR and OP_XOR can stop processing the operands +as soon as the shorter operand is exhausted. + +====Bitshift Opcodes==== + +Note that these are raw bitshifts, unlike the sign-preserving arithmetic +shifts in Bitcoin v0.3.0, and as such they also do not truncate trailing +zeroes from results: they are renamed OP_UPSHIFT (née OP_LSHIFT) and +OP_DOWNSHIFT (née OP_RSHIFT). + +{| +! Mnemonic +! Opcode +! Input Stack +! Description +! Definition +! Varops Cost +! 
Varops Reason +|- +|OP_UPSHIFT +|152 +|[A BITS] +|Move bits of A right by BITS (numerically increase) +| +# Pop operands off the stack. +# If A shifted by value(BITS) would exceed the individual stack limit, fail. +# If value(BITS) % 8 == 0: simply prepend value(BITS) / 8 zeroes to A. +# Otherwise: prepend (value(BITS) / 8) + 1 zeroes to A, then shift A *down* (8 - (value(BITS) % 8)) bits. +# Push A onto the stack. +|length(BITS) * 2 + (Value of BITS) / 8 * 2 + length(A) * 3. If BITS % 8 != 0, add length(A) * 4 +|LENGTHCONV + ZEROING + COPYING. If BITS % 8 != 0, + OTHER. +|- +|OP_DOWNSHIFT +|153 +|[A BITS] +|Move bits of A left by BITS (numerically decrease) +| +# Pop operands off the stack. +# For BITOFF from 0 to (length(A)-1) * 8 - value(BITS): +## Copy each bit in A from BITOFF + value(BITS) to BITOFF. +# Truncate A to remove value(BITS) / 8 bytes from the end (or all bytes, if value(BITS) / 8 > length(A)). +# Push A onto the stack. +|length(BITS) * 2 + MAX((length(A) - (Value of BITS) / 8), 0) * 3 +|LENGTHCONV + COPYING +|} + +=====Rationale===== + +DOWNSHIFT needs to read the value of the second operand BITS. It then needs +to move the remainder of A (the part after offset BITS/8 bytes). In practice +this should be implemented in word-size chunks, not bit-by-bit! + +UPSHIFT also needs to read BITS. In general, it may need to reallocate +(copying A and zeroing out remaining words). If not moving an exact number of +bytes (BITS % 8 != 0), another pass is needed to perform the bitshift. + +OP_UPSHIFT can produce huge results, and so must be checked for limits prior +to evaluation. It is also carefully defined to avoid reallocating twice +(reallocating to prepend bytes, then again to append a single byte) which has +the practical advantage of being able to share the same downward bitshift +routine as OP_DOWNSHIFT. + +====Multiply and Divide Opcodes==== + +{| +! Mnemonic +! Opcode +! Input Stack +! Description +! Definition +! Varops Cost +! 
Varops Reason +|- +|OP_2MUL +|141 +|[A] +|Multiply A by 2 +| +# Pop operands off the stack. +# Shift each byte in A 1 bit to the left (increasing values, equivalent to C's << operator), overflowing into the next byte. +# If the final byte overflows, append a single 1 byte. +# Otherwise, truncate A at the last non-zero byte. +# Push A onto the stack. +|length(A) * 7 +|OTHER + COPYING +|- +|OP_2DIV +|142 +|[A] +|Divide A by 2 +| +# Pop operands off the stack. +# Shift each byte in A 1 bit to the right (decreasing values, equivalent to C's >> operator), taking the next byte’s bottom bit as the value of the top bit, and tracking the last non-zero value. +# Truncate A at the last non-zero byte. +# Push A onto the stack. +|length(A) * 4 +|OTHER +|- +|OP_MUL +|149 +|[A B] +|Multiply A by B +| +# Pop operands off the stack. +# Calculate the varops cost of the operation: if it exceeds the remaining budget, fail. +# Allocate an all-zero vector R of length equal to length(A) + length(B). +# For each word in A, multiply it by B and add it into the vector R, offset by the word offset in A. +# Truncate R at the last non-zero byte. +# Push R onto the stack. +|(length(A) + length(B)) * 3 + (length(A) + 7) / 8 * length(B) * 27 (BEWARE OVERFLOW) +|See Appendix +|- +|OP_DIV +|150 +|[A B] +|Divide A by (non-zero) B +| +# Pop operands off the stack. +# Calculate the varops cost of the operation: if it exceeds the remaining budget, fail. +# If B is empty or all zeroes, fail. +# Perform division as per Knuth's The Art of Computer Programming v2 page 272, Algorithm D "Division of non-negative integers". +# Trim trailing zeroes off the quotient. +# Push the quotient onto the stack. +|length(A) * 18 + length(B) * 4 + length(A)^2 * 2 / 3 (BEWARE OVERFLOW) +|See Appendix +|- +|OP_MOD +|151 +|[A B] +|Replace A with remainder when A divided by (non-zero) B +| +# Pop operands off the stack. +# Calculate the varops cost of the operation: if it exceeds the remaining budget, fail. 
+# If B is empty or all zeroes, fail. +# Perform division as per Knuth's The Art of Computer Programming v2 page 272, Algorithm D "Division of non-negative integers". +# Trim trailing zeroes off the remainder. +# Push the remainder onto the stack. +|length(A) * 18 + length(B) * 4 + length(A)^2 * 2 / 3 (BEWARE OVERFLOW) +|See Appendix +|} + +=====Rationale===== + +These opcodes can be computationally intensive, which is why the varops budget must be checked before the operation is performed. OP_2MUL and OP_2DIV are far simpler, equivalent to OP_UPSHIFT and OP_DOWNSHIFT by 1 bit, except truncating the most-significant zero bytes. + +The detailed rationale for these costs can be found in Appendix A. + +===Limited Hashing Opcodes=== + +OP_RIPEMD160 and OP_SHA1 are now defined to FAIL validation if their operands exceed 520 bytes.<ref>There seems little reason to allow large hashing with SHA1 and RIPEMD, and they are not as optimized as SHA256, so we restrict their usage to the older byte limit.</ref> + +===Extended Opcodes=== + +The opcodes OP_ADD, OP_SUB, OP_1ADD and OP_1SUB are redefined in 0xC2 Tapscript to operate on variable-length unsigned integers. These always produce minimal values (no trailing zero bytes). + +{| +! Mnemonic +! Opcode +! Input Stack +! Description +! Definition +! Varops Cost +! Varops Reason +|- +|OP_ADD +|147 +|[A B] +|Add A and B +| +# Pop operands off the stack. +# Option 1: trim trailing zeroes off A and B. +# If B is longer than A, swap A and B. +# For each byte in B, add it and previous overflow into the equivalent byte in A, remembering next overflow. +# If there was final overflow, append a 1 byte to A. +# Option 2: If there was no final overflow, remember last non-zero byte written into A, and truncate A after that point. +# Either Option 1 or Option 2 MUST be implemented. +|MAX(length(A), length(B)) * 9 +|ARITH + COPYING +|- +|OP_1ADD +|139 +|[A] +|Add one to A +| +# Pop operands off the stack. +# Let B = 1, and continue as OP_ADD. 
+|MAX(1, length(A)) * 9 +|ARITH + COPYING +|- +|OP_SUB +|148 +|[A B] +|Subtract B from A where B is <= A +| +# Pop operands off the stack. +# For each byte in B, subtract it and previous underflow from the equivalent byte in A, remembering next underflow. +# If there was final underflow, fail validation. +# Remember last non-zero byte written into A, and truncate A after that point. +|MAX(length(A), length(B)) * 6 +|ARITH +|- +|OP_1SUB +|140 +|[A] +|Subtract 1 from (non-zero) A +| +# Pop operands off the stack. +# Let B = 1, and continue as OP_SUB. +|MAX(1, length(A)) * 6 +|ARITH +|} + +====Rationale==== + +Note that the basic cost for ADD is six times the maximum operand length +(ARITH), but then considers the case where a reallocation and copy needs to +occur to append the final carry byte (COPYING, which costs 3 units per byte). + +Subtraction is cheaper because underflow does not occur: that is a validation +failure, as mathematicians agree the result would not be natural. + +===Misc Operators=== + +The following opcodes have costs below: + +{| +! Opcode +! Varops Budget Cost +! Varops Reason +|- +| OP_CHECKLOCKTIMEVERIFY +| Length of operand * 2 +| LENGTHCONV +|- +| OP_CHECKSEQUENCEVERIFY +| Length of operand * 2 +| LENGTHCONV +|- +| OP_CHECKSIGADD +| MAX(1, length(number operand)) * 9 + 500,000 +| ARITH + COPYING + SIGCHECK +|- +| OP_CHECKSIG +| 500,000 +| SIGCHECK +|- +| OP_CHECKSIGVERIFY +| 500,000 +| SIGCHECK +|} + +====Rationale==== + +OP_CHECKSIGADD does an OP_1ADD on success, so we use the same cost as that. +For simplicity, this is charged whether the OP_CHECKSIGADD succeeds or not. 
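The byte-by-byte definitions of OP_ADD and OP_SUB in Extended Opcodes, together with their varops costs, can be sketched as below. This is an illustrative sketch only: it follows "Option 1" for OP_ADD, assumes the cost is charged on the operand lengths as they appear on the stack (before trimming), and uses hypothetical names (`ScriptFailure`, `op_add`, `op_sub`). A real implementation would work in word-sized chunks rather than single bytes.

```python
from typing import Tuple


class ScriptFailure(Exception):
    """Hypothetical exception standing in for a script validation failure."""


def trim(value: bytes) -> bytes:
    """Remove trailing (most-significant) zero bytes."""
    return value.rstrip(b"\x00")


def op_add(a: bytes, b: bytes) -> Tuple[bytes, int]:
    """Return (A + B, varops cost). Cost is ARITH + COPYING = 9 per byte."""
    cost = max(len(a), len(b)) * 9        # assumption: charged on untrimmed lengths
    a, b = trim(a), trim(b)               # Option 1: trim operands first
    if len(b) > len(a):
        a, b = b, a                       # fold into the longer operand
    out = bytearray(a)
    carry = 0
    for i in range(len(out)):
        total = out[i] + (b[i] if i < len(b) else 0) + carry
        out[i] = total & 0xFF
        carry = total >> 8
    if carry:
        out.append(1)                     # final overflow appends a single 1 byte
    return bytes(out), cost


def op_sub(a: bytes, b: bytes) -> Tuple[bytes, int]:
    """Return (A - B, varops cost), failing on underflow. Cost is ARITH = 6 per byte."""
    cost = max(len(a), len(b)) * 6        # assumption: charged on untrimmed lengths
    b = trim(b)
    if len(b) > len(a):
        raise ScriptFailure("underflow")  # B has a non-zero byte beyond the end of A
    out = bytearray(a)
    borrow = 0
    for i in range(len(out)):
        diff = out[i] - (b[i] if i < len(b) else 0) - borrow
        borrow = 1 if diff < 0 else 0
        out[i] = diff & 0xFF
    if borrow:
        raise ScriptFailure("underflow")  # result would be negative
    return trim(bytes(out)), cost
```

OP_1ADD and OP_1SUB correspond to calling these with `b"\x01"` as B, and the arithmetic part of OP_CHECKSIGADD is charged like OP_1ADD.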
+ +===Other Operators=== + +The varops costs of the following opcodes are defined in +[[bip-unknown-varops-budget.mediawiki|BIP-varops]]: + +* OP_VERIFY +* OP_NOT +* OP_0NOTEQUAL +* OP_EQUAL +* OP_EQUALVERIFY +* OP_2DUP +* OP_3DUP +* OP_2OVER +* OP_IFDUP +* OP_DUP +* OP_OVER +* OP_PICK +* OP_TUCK +* OP_ROLL +* OP_BOOLOR +* OP_NUMEQUAL +* OP_NUMEQUALVERIFY +* OP_NUMNOTEQUAL +* OP_LESSTHAN +* OP_GREATERTHAN +* OP_LESSTHANOREQUAL +* OP_GREATERTHANOREQUAL +* OP_MIN +* OP_MAX +* OP_WITHIN +* OP_SHA256 +* OP_HASH160 +* OP_HASH256 + +Any opcodes not mentioned in this document or the preceding list have a cost +of 0 (they do not operate on variable-length stack objects). + +==Backwards compatibility== + +This BIP defines a previously unused (and thus, always-successful) tapscript +version, for backwards compatibility. + +==Reference Implementation== + +Work in progress: + + https://github.com/jmoik/bitcoin/tree/gsr + +==Changelog== + +* 0.2.0: 2025-02-21: change costs to match those in varops budget +* 0.1.0: 2025-09-27: first public posting + +==Thanks== + +This BIP would not exist without the thoughtful contributions of coders who +considered all the facets carefully and thoroughly, and also my inspirational +wife Alex and my kids who have been tirelessly supportive of my +esoteric-seeming endeavors such as this! + +In alphabetical order: +* Anthony Towns +* Brandon Black (aka Reardencode) +* John Light +* Jonas Nick +* Mark "Murch" Erhardt +* Rijndael (aka rot13maxi) +* Steven Roose +* FIXME: your name here! + +==Appendix A: Cost Model Calculations for Multiply and Divide== + +Multiplication and division require multiple passes over the operands, meaning +a cost proportional to the square of the lengths involved, and the word size +used for that iteration makes a difference. 
We assume that 8 bytes (64 bits) are +evaluated at a time, and that the CPU can multiply two 64-bit numbers to receive +a 128-bit result, and divide a 128-bit number by a 64-bit number to receive a +128-bit quotient and remainder. This is true on modern 64-bit CPUs (sometimes +using multiple instructions). + +===Multiplication Cost=== + +For multiplication, the steps break down like so: +# Allocate and zero the result: cost = (length(A) + length(B)) * 2 (ZEROING) +# For each word in A: +#* Multiply by each word in B, into a scratch vector: cost = 6 * length(B) (ARITH) +#* Sum scratch vector at the word offset into the result: cost = 6 * length(B) (ARITH) + +Note: we do not assume Karatsuba, Toom-Cook or other optimizations. + +The theoretical cost is: (length(A) + length(B)) * 2 + (length(A) + 7) / 8 * length(B) * 12. + +However, benchmarking reveals that the inner loop overhead (branch +misprediction, cache effects on small elements) is undercosted by the +theoretical model. A 2.25× multiplier on the quadratic term accounts for +this, giving a cost of: (length(A) + length(B)) * 3 + (length(A) + 7) / 8 * +length(B) * 27. + +This is slightly asymmetric: in practice an implementation usually finds that +CPU pipelining means choosing B as the larger operand is optimal. + +===Division Cost=== + +For division, the steps break down like so: + +# Bit shift both operands to set top bit of B (OP_UPSHIFT, without overflow for B): cost = length(A) * 6 + length(B) * 4 + +# Trim trailing bytes. This costs according to the number of bytes removed, but since that is subtractive on future costs, we ignore it. + +# If B is longer, the answer is 0 already. So assume A is longer from now on (or equal length). 
+ +# Compare: cost = length(A) * 2 (COMPARING) + +# Subtract: cost = length(A) * 6 (ARITH) + +# for (length(A) - NormalizedLength(B)) in words: +## Multiply word by B -> scratch: cost = NormalizedLength(B) * 6 (ARITH) +## Subtract scratch from A: cost = length(A) * 6 (ARITH) +## Add B into A (no overflow): cost = length(A) * 6 (ARITH) +## Shrink A by 1 word. + +# OP_MOD: shift A down, trim trailing zeroes: cost = length(A) * 4 + +# OP_DIV: trim trailing zeros: cost = length(A) * 4 + +Note that the loop at step 6 shrinks A every time, so the *average* cost of +each iteration is (NormalizedLength(B) * 6 + length(A) * 12) / 2. The cost of +step 6 is: + + (length(A) - NormalizedLength(B)) / 8 * (NormalizedLength(B) * 6 + length(A) * 12) / 2 + +The worst case is when NormalizedLength(B) is 0: length(A) * length(A) * 2 / 3. + +The cost for all the steps is: length(A) * 18 + length(B) * 4 + length(A) * length(A) * 2 / 3. diff --git a/bip-unknown-varops-budget.mediawiki b/bip-unknown-varops-budget.mediawiki new file mode 100644 index 0000000000..bfc927b3a4 --- /dev/null +++ b/bip-unknown-varops-budget.mediawiki @@ -0,0 +1,353 @@ +
+  BIP: ?
+  Layer: Consensus (soft fork)
+  Title: Varops Budget For Script Runtime Constraint
+  Authors: Rusty Russell 
+           Julian Moik 
+  Status: Draft
+  Type: Specification
+  Assigned: ?
+  License: BSD-3-Clause
+  Discussion: https://groups.google.com/g/bitcoindev/c/GisTcPb8Jco/m/8znWcWwKAQAJ
+              https://delvingbitcoin.org/t/benchmarking-bitcoin-script-evaluation-for-the-varops-budget-great-script-restoration/2094
+  Version: 0.2.0
+
+ +==Introduction== + +===Abstract=== + +This BIP defines a "varops budget", which generalizes the "sigops budget" introduced in [[bip-0342.mediawiki|BIP342]] to non-signature operations. + +This BIP is a useful framework for other BIPs to draw upon, and provides opcode examples which are always less restrictive than current rules. + +===Copyright=== + +This document is licensed under the 3-clause BSD license. + +===Motivation=== + +Since Bitcoin v0.3.1 (addressing CVE-2010-5137), Bitcoin's scripting capabilities have been significantly restricted to mitigate known vulnerabilities related to excessive computational time and memory usage. These early safeguards were necessary to prevent denial-of-service attacks and ensure the stability and reliability of the Bitcoin network. + +However, as Bitcoin usage becomes more sophisticated, these limitations are becoming more salient. New proposals often must explicitly address potential performance pitfalls by severely limiting their scope, introducing specialized caching strategies to mitigate execution costs, or using the existing sigops budget in ad-hoc ways to enforce dynamic execution limits. + +This BIP introduces a simple, systematic and explicit cost framework for evaluating script operations based on stack data interactions, using worst-case behavior as the limiting factor. Even with these pessimistic assumptions, large classes of scripts can be shown to be within budget (for all possible inputs) by static analysis. + +===A Model For Opcodes Dealing With Stack Data=== + +Without an explicit and low limit on the size of stack operands, the bottleneck for script operations is based on the time taken to process the stack data it accesses (with the exception of signature operations). The cost model uses the length of the stack inputs (or occasionally, outputs), hence the term "varops". + +* We assume that the manipulation of the stack vector itself (e.g. 
OP_DROP) is negligible (with the exception of OP_ROLL) +* We assume that memory allocation and deallocation overhead is negligible. +* We do not consider the cost of the script interpretation itself, which is necessarily limited by block size. +* We assume implementations use simple linear arrays/vectors of contiguous memory. +* We assume implementations use linear accesses to stack data (perhaps multiple times): random accesses would require an extension to the model. +* We assume object size is limited to the entire transaction (4,000,000 bytes, worst-case). +* Costs are based on the worst-case behavior of each opcode. + +The last two assumptions make a large difference in practice: normal usage on small, cache-hot objects is much faster than this model suggests. But an implementation which is more efficient than the versions modeled does not introduce any problems (though a future soft-fork version may want to reflect this in reduced costings): only an implementation with a significantly worse worst-case behavior would be problematic. + +==Design== + +A per-transaction integer "varops budget" is determined by multiplying the total transaction weight by the fixed factor 10,000 (chosen to make operation costs all integer values). The budget is transaction-wide (rather than per-input) to allow for cross-input introspection: a small input may reasonably access larger inputs. + +Opcodes consume budget as they are executed, based on the length (not generally the value) of their parameters as detailed below. A transaction which exceeds its budget fails to validate. + +===Derivation of Costs=== + +The costs of opcodes were determined by benchmarking on a variety of platforms. + +As each block can contain 80,000 Schnorr signature checks, we used this as a reasonable upper bound for maximally slow block processing. 
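The budget mechanism described in the Design section can be sketched as a simple counter. This is a hedged illustration rather than the reference implementation; the class and exception names are hypothetical, and the per-opcode costs passed to `charge` would come from the cost tables.

```python
VAROPS_PER_WEIGHT_UNIT = 10_000   # fixed factor from the Design section
MAX_BLOCK_WEIGHT = 4_000_000      # existing consensus weight limit


class BudgetExceeded(Exception):
    """Hypothetical exception: the transaction fails to validate."""


class VaropsBudget:
    """Transaction-wide budget, shared by every input of the transaction."""

    def __init__(self, tx_weight: int) -> None:
        self.remaining = tx_weight * VAROPS_PER_WEIGHT_UNIT

    def charge(self, cost: int) -> None:
        """Deduct an opcode's cost, failing validation if the budget is exceeded."""
        if cost > self.remaining:
            raise BudgetExceeded(f"need {cost}, have {self.remaining}")
        self.remaining -= cost


if __name__ == "__main__":
    budget = VaropsBudget(tx_weight=600)
    budget.charge(1_000_000)
    print(budget.remaining)
    print(MAX_BLOCK_WEIGHT * VAROPS_PER_WEIGHT_UNIT)  # budget of a full block
```

Because the counter belongs to the transaction rather than to a single input, a small input can spend budget contributed by larger inputs, as the Design section intends.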
+ +To estimate a conservative maximum runtime for each opcode, we consider scripts with two constraints: +# the script size is limited by the existing weight limit of 4,000,000 units and +# the script can only consume the varops budget of a whole block: 10,000 * 4,000,000 (40 billion). + +The script is assumed to execute in a single thread and, to be conservative, to act on initial stack elements that are not counted against these limits. + +Ideally, on each platform we tested, the worst-case time for each opcode would be no worse than the Schnorr signature upper bound: i.e. the block would get no slower. And while CHECKSIG can be batched and/or done in parallel, it also involves hashing, which is not taken into account here (the worst-case being significantly slower than the signature validations themselves). + +The signature cost is simply carried across from the existing [[bip-0342.mediawiki|BIP-342]] limit: each 50 weight units allows one signature. Since each transaction gets varops budget for the entire transaction (not just the current input's witness), and each input has at least 41 bytes (164 weight), this is actually slightly more generous than the sigops budget (which was 50 + witness weight), but still limits the entire block to 80,000 signatures. + +===Benchmarks=== + +The costs were validated by constructing maximally expensive scripts for every opcode (filling a full block with worst-case operands) and measuring wall-clock execution time across 14 machines spanning x86_64, ARM64, Linux, macOS, and Windows: + +Apple M1 Pro, Apple M2, Apple M4 Pro (macOS + Linux/Docker), AMD Ryzen 5 3600, AMD Ryzen 7 5800U, AMD Ryzen 9 9950X, Intel i5-12500, Intel i7-7700, Intel i7-8700, Intel i9-9900K, Intel N150 (Umbrel), Raspberry Pi 5. + +For each machine, the ratio of the GSR (Great Script Restoration) worst-case block time to the existing (pre-GSR) worst-case block time was computed. A ratio below 1.0 means GSR does not introduce a new worst case. On all 14 tested machines, the ratio is below 1.0. 
+ +Without GSR, the slowest blocks are dominated by: +* Schnorr signature validation (80,000 signatures per block) +* Repeated hashing of 520-byte elements (3DUP + HASH256, 3DUP + RIPEMD160) + +With GSR enabled, the slowest blocks restricted by the varops budget depend on the machine but usually include: +* Schnorr signature validation (80,000 signatures per block) +* Repeated hashing of 520-byte elements (3DUP + HASH256, 3DUP + RIPEMD160) +* Small-element multiplication (MUL on 1-byte operands) +* Large-element left-shift with copying (2DUP + LSHIFT on 10KB elements) +* Hashing of 1KB elements (HASH256) +* Stack rolling at maximum depth (ROLL) +* Large-element copying (TUCK, CAT on 100KB–2MB elements) + +The raw benchmark data, analysis scripts, and visualized results are available at https://github.com/jmoik/varopsData. This repository also provides instructions for running the benchmarks on your own machine. + + +===Cost Categories=== + +We divided operations into six speed categories: + +# Signature operations. +# Hashing operations. +# OP_ROLL, which does a large-scale stack movement. +# Fast operations: comparing bytes, comparing bytes against zero, and zeroing bytes. Empirically, these have been shown to be well-optimized. +# Copying bytes: slightly more expensive than fast operations due to memory allocation overhead. +# Everything else. + +Each class then has the following costs: + +# Signature operations cost 500,000 (10,000 * 50) units each; this mirrors the cost under the existing sigops budget. +# Hashing costs 50 units per byte hashed. +# OP_ROLL costs an additional 48 units (24 bytes per std::vector * 2 units per byte) per stack entry moved (i.e. the value of its operand). +# Fast operations cost 2 units per byte output. +# Copying operations cost 3 units per byte output. +# Arithmetic operations (which don't pipeline as well due to the overflow between words) cost 6 units per byte output. +# Other operations cost 4 units per byte output. 
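The unit costs listed above can be collected into constants to check the figures quoted elsewhere in this document, such as the 80,000-signature block bound. The sketch below is illustrative only; the constant and function names are hypothetical, and the values are taken directly from the list above.

```python
VAROPS_PER_WEIGHT_UNIT = 10_000
MAX_BLOCK_WEIGHT = 4_000_000
BLOCK_BUDGET = MAX_BLOCK_WEIGHT * VAROPS_PER_WEIGHT_UNIT

# Flat and per-unit costs from the list above.
SIGCHECK_COST = 50 * VAROPS_PER_WEIGHT_UNIT   # one signature per 50 weight units
HASH_COST_PER_BYTE = 50
ROLL_COST_PER_ENTRY = 24 * 2                  # 24 bytes per entry * 2 units per byte
FAST_COST_PER_BYTE = 2                        # comparing, comparing to zero, zeroing
COPY_COST_PER_BYTE = 3
ARITH_COST_PER_BYTE = 6
OTHER_COST_PER_BYTE = 4


def max_signatures_per_block() -> int:
    """Signature checks affordable if a full block spends its whole budget on them."""
    return BLOCK_BUDGET // SIGCHECK_COST


def hash_cost(length: int) -> int:
    """Cost of hashing one element of the given length."""
    return length * HASH_COST_PER_BYTE


def roll_move_cost(entries_moved: int) -> int:
    """Stack-movement part of the OP_ROLL cost."""
    return entries_moved * ROLL_COST_PER_ENTRY


if __name__ == "__main__":
    print(max_signatures_per_block())
    print(hash_cost(520))
    print(roll_move_cost(32_768))
```

One consequence worth noting: at 50 units per byte, hashing a 10,000-byte element costs as much as one signature check.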
+
+===Variable Opcode Budget===
+
+We use the following annotations to indicate the derivation for each opcode:
+
+;COMPARING
+: Comparing two objects: cost = 2 per byte compared.
+;COMPARINGZERO
+: Comparing an object against zeroes: cost = 2 per byte compared.
+;ZEROING
+: Zeroing out bytes: cost = 2 per byte zeroed.
+;COPYING
+: Copying bytes: cost = 3 per byte copied.
+;LENGTHCONV
+: Converting an operand to a length value, including verifying that trailing bytes are zero: cost = 2 per byte examined.
+;ARITH
+: Arithmetic operations which have carry operations: cost = 6 per byte examined.
+;SIGCHECK
+: Checking a signature is a flat cost: cost = 500,000.
+;HASH
+: cost = 50 per byte hashed.
+;ROLL
+: cost = 48 per stack element moved.
+;OTHER
+: all other operations which take a variable-length parameter: cost = 4 per byte written.
+
+Note that COMPARINGZERO is a subset of COMPARING: an implementation must examine every byte of a stack element to determine if the value is 0. This can be done efficiently using existing comparison techniques, e.g. check the first byte, then `memcmp(first, first+1, len-1)`.
+
+Note that LENGTHCONV is used where script interprets a value as a length. Without explicit limits on number size, such (little-endian) values might have to be examined in their entirety to ensure any trailing bytes are zero, implying a COMPARINGZERO operation after the first few bytes.
+
+The top of stack is labeled A, with successive values B, C, etc.
+
+==Example Opcodes==
+
+The following opcodes demonstrate the approach, with an analysis of how the costs apply:
+
+===Example: Control And Simple Examination Opcodes===
+
+{|
+! Opcode
+! Varops Budget Cost
+! Reason
+|-
+|OP_VERIFY
+|length(A) * 2
+|COMPARINGZERO
+|-
+|OP_NOT
+|length(A) * 2
+|COMPARINGZERO
+|-
+|OP_0NOTEQUAL
+|length(A) * 2
+|COMPARINGZERO
+|-
+|OP_EQUAL
+|If length(A) != length(B): 0, otherwise length(A) * 2
+|COMPARING
+|-
+|OP_EQUALVERIFY
+|If length(A) != length(B): 0, otherwise length(A) * 2
+|COMPARING
+|}
+
+====Rationale====
+
+OP_IF and OP_NOTIF in Tapscript require minimal values, so do not take variable length parameters, hence are not considered here.
+
+OP_EQUAL and OP_EQUALVERIFY don't have to examine any data (and the Bitcoin Core implementation does not) if the lengths are different.
+
+===Example: Stack Manipulation===
+
+{|
+! Opcode
+! Varops Budget Cost
+! Reason
+|-
+|OP_2DUP
+|(length(A) + length(B)) * 3
+|COPYING
+|-
+|OP_3DUP
+|(length(A) + length(B) + length(C)) * 3
+|COPYING
+|-
+|OP_2OVER
+|(length(C) + length(D)) * 3
+|COPYING
+|-
+|OP_IFDUP
+|length(A) * 5
+|COMPARINGZERO + COPYING
+|-
+|OP_DUP
+|length(A) * 3
+|COPYING
+|-
+|OP_OVER
+|length(B) * 3
+|COPYING
+|-
+|OP_PICK
+|length(A) * 2 + length(A-th-from-top) * 3
+|LENGTHCONV + COPYING
+|-
+|OP_TUCK
+|length(A) * 3
+|COPYING
+|-
+|OP_ROLL
+|length(A) * 2 + 48 * Value of A
+|LENGTHCONV + ROLL
+|}
+
+====Rationale====
+
+These operators copy a stack entry and write to another. OP_IFDUP has the same worst-case cost as OP_IF + OP_DUP.
+
+OP_ROLL needs to read its operand, then move that many elements on the stack. It is the only opcode for which the stack manipulation cost is variable (and, regretfully, non-trivial), so we need to limit it.
+
+A reasonable implementation (and the current bitcoind C++ implementation) is to use 24 bytes for each stack element (a pointer, a size and a maximum capacity), and this value works reasonably well in practice.
+
+===Example: Comparison Operators===
+
+{|
+! Opcode
+! Varops Budget Cost
+! Reason
+|-
+|OP_BOOLAND
+|(length(A) + length(B)) * 2
+|COMPARINGZERO
+|-
+|OP_BOOLOR
+|(length(A) + length(B)) * 2
+|COMPARINGZERO
+|-
+|OP_NUMEQUAL
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_NUMEQUALVERIFY
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_NUMNOTEQUAL
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_LESSTHAN
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_GREATERTHAN
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_LESSTHANOREQUAL
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_GREATERTHANOREQUAL
+|MAX(length(A), length(B)) * 2
+|COMPARING + COMPARINGZERO
+|-
+|OP_MIN
+|MAX(length(A), length(B)) * 4
+|OTHER
+|-
+|OP_MAX
+|MAX(length(A), length(B)) * 4
+|OTHER
+|-
+|OP_WITHIN
+|(MAX(length(C), length(B)) + MAX(length(C), length(A))) * 2
+|COMPARING + COMPARINGZERO
+|}
+
+====Rationale====
+
+Numerical comparison of little-endian numbers involves a byte-by-byte comparison, then if one is longer, checking that the remainder is all zero bytes.
+
+However, OP_MAX and OP_MIN also normalize their result, which means they can't use the optimized comparison routine but must instead track the final non-zero byte to perform truncation.
+
+===Example: Hash Operators===
+
+{|
+! Opcode
+! Varops Budget Cost
+! Reason
+|-
+|OP_SHA256
+|(Length of the operand) * 50
+|HASH
+|-
+|OP_HASH160
+|(Length of the operand) * 50
+|HASH
+|-
+|OP_HASH256
+|(Length of the operand) * 50
+|HASH
+|}
+
+====Rationale====
+
+SHA256 has been well-optimized for current hardware, as it is already critical to Bitcoin's operation. Additional once-off steps such as the final SHA round, and RIPEMD or a second SHA256 are not proportional to the input, so are not included in the cost model.
+
+A model for other hash operations (OP_SHA1, OP_RIPEMD160) is possible, but we have not done so.
They are not generally optimized, and if they were permitted on large inputs, this would have to be done.
+
+==Reference Implementation==
+
+Work in progress:
+
+ https://github.com/jmoik/bitcoin/tree/gsr
+
+==Changelog==
+
+* 0.2.0: 2026-02-21: increase in cost for hashing and copying based on benchmark results.
+* 0.1.0: 2025-09-27: first public posting
+
+==Thanks==
+
+This BIP would not exist without the thoughtful contributions of coders who considered all the facets carefully and thoroughly, and also my inspirational wife Alex and my kids who have been tirelessly supportive of my esoteric-seeming endeavors such as this!
+
+In alphabetical order:
+* Anthony Towns
+* Brandon Black (aka Reardencode)
+* John Light
+* Jonas Nick
+* Mark "Murch" Erhardt
+* Rijndael (aka rot13maxi)
+* Steven Roose
+* FIXME: your name here!
+
+== Footnotes ==
+
+

From 32035058b4c746523e7e8dd85190f7fd5983b121 Mon Sep 17 00:00:00 2001
From: Rusty Russell
Date: Sun, 29 Mar 2026 14:33:12 +1030
Subject: [PATCH 2/3] script restoration: fix MUL cost to account to round up B to word boundary.

Julian points out that the implementation does this, which improves
accuracy for the case of small B (since the term is multiplied: for
normal OP_ADD etc we don't bother, since the difference is very bounded).

Signed-off-by: Rusty Russell
---
 bip-unknown-script-restoration.mediawiki | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/bip-unknown-script-restoration.mediawiki b/bip-unknown-script-restoration.mediawiki
index d904678819..bb3b0dfc0e 100644
--- a/bip-unknown-script-restoration.mediawiki
+++ b/bip-unknown-script-restoration.mediawiki
@@ -9,7 +9,7 @@
   Assigned: ?
   License: BSD-3-Clause
   Discussion: https://groups.google.com/g/bitcoindev/c/GisTcPb8Jco/m/8znWcWwKAQAJ
-  Version: 0.1.0
+  Version: 0.2.1
   Requires: Varops BIP
@@ -624,6 +624,7 @@ Work in progress:

 ==Changelog==

+* 0.2.1: 2023-03-27: fix OP_MUL cost to round length(B) up
 * 0.2.0: 2025-02-21: change costs to match those in varops budget
 * 0.1.0: 2025-09-27: first public posting

@@ -659,18 +660,22 @@ using multiple instructions).
 For multiplication, the steps break down like so:
 # Allocate and zero the result: cost = (length(A) + length(B)) * 2 (ZEROING)
 # For each word in A:
-#* Multiply by each word in B, into a scratch vector: cost = 6 * length(B) (ARITH)
-#* Sum scratch vector at the word offset into the result: cost = 6 * length(B) (ARITH)
+#* Multiply by each word in B, into a scratch vector: cost = 6 * ((length(B) + 7) / 8) * 8 (ARITH)
+#* Sum scratch vector at the word offset into the result: cost = 6 * ((length(B) + 7) / 8) * 8 (ARITH)
+
+We increase the length of B here to the next word boundary, using
+"((length(B) + 7) / 8) * 8", as the multiplication below makes the
+difference of that from the simple "length(B)" significant.

 Note: we do not assume Karatsuba, Toom-Cook or other optimizations.

-The theoretical cost is: (length(A) + length(B)) * 2 + (length(A) + 7) / 8 * length(B) * 12.
+The theoretical cost is: (length(A) + length(B)) * 2 + (length(A) + 7) / 8 * ((length(B) + 7) / 8) * 8 * 12.

 However, benchmarking reveals that the inner loop overhead (branch
 misprediction, cache effects on small elements) is undercosted by the
 theoretical model. A 2.25× multiplier on the quadratic term accounts for
 this, giving a cost of: (length(A) + length(B)) * 3 + (length(A) + 7) / 8 *
-length(B) * 27.
+((length(B) + 7) / 8) * 8 * 27.

 This is slightly asymmetric: in practice an implementation usually finds
 that CPU pipelining means choosing B as the larger operand is optimal.
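The revised OP_MUL formulas above can be checked with a few lines of Python. This is a minimal sketch rather than the reference implementation: the function names are hypothetical, and it assumes that every "/" in the formulas is truncating integer division and that a word is 8 bytes.

```python
# Illustrative sketch only: function names are hypothetical, and "/" in the
# BIP formulas is assumed to mean truncating integer division.

WORD_BYTES = 8  # assumed word size, matching the "+ 7) / 8" rounding


def words(length: int) -> int:
    """Number of 8-byte words needed to hold `length` bytes."""
    return (length + WORD_BYTES - 1) // WORD_BYTES


def round_up_to_word(length: int) -> int:
    """`length` rounded up to the next multiple of 8 bytes."""
    return words(length) * WORD_BYTES


def mul_cost_theoretical(len_a: int, len_b: int) -> int:
    """Theoretical cost: zeroing the result plus two ARITH passes per word of A."""
    return (len_a + len_b) * 2 + words(len_a) * round_up_to_word(len_b) * 12


def mul_cost(len_a: int, len_b: int) -> int:
    """Benchmark-adjusted cost, with the 2.25x multiplier on the quadratic term."""
    return (len_a + len_b) * 3 + words(len_a) * round_up_to_word(len_b) * 27


def mul_cost_before_fix(len_a: int, len_b: int) -> int:
    """Adjusted cost as it read before this patch (no rounding of length(B))."""
    return (len_a + len_b) * 3 + words(len_a) * len_b * 27


if __name__ == "__main__":
    # Two 1-byte operands: rounding B up to a full word dominates the result.
    print(mul_cost_before_fix(1, 1), mul_cost(1, 1))
```

For two 1-byte operands the unrounded formula gives 33 units while the rounded one gives 222, which shows why the rounding matters most for small B.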
From 78e7562de35093dc8f53f5c73e9a2f5e2cf9f24f Mon Sep 17 00:00:00 2001
From: Rusty Russell
Date: Sun, 29 Mar 2026 14:33:12 +1030
Subject: [PATCH 3/3] BIP 440, 441: official numbers, into README.mediawiki and renamed.

Signed-off-by: Rusty Russell
---
 README.mediawiki | 14 ++++++++++++++
 ...n-varops-budget.mediawiki => bip-0440.mediawiki | 4 ++--
 ...ipt-restoration.mediawiki => bip-0441.mediawiki | 12 ++++++------
 3 files changed, 22 insertions(+), 8 deletions(-)
 rename bip-unknown-varops-budget.mediawiki => bip-0440.mediawiki (99%)
 rename bip-unknown-script-restoration.mediawiki => bip-0441.mediawiki (98%)

diff --git a/README.mediawiki b/README.mediawiki
index 55580ee0aa..5ac15ab4c3 100644
--- a/README.mediawiki
+++ b/README.mediawiki
@@ -1409,6 +1409,20 @@ users (see also: [https://en.bitcoin.it/wiki/Economic_majority economic majority
 | Specification
 | Draft
 |-
+| [[bip-0440.mediawiki|440]]
+| Consensus (soft fork)
+| Varops Budget For Script Runtime Constraint
+| Rusty Russell, Julian Moik
+| Specification
+| Draft
+|-
+| [[bip-0441.mediawiki|441]]
+| Consensus (soft fork)
+| Restoration of disabled script (Tapleaf 0xC2)
+| Rusty Russell, Julian Moik
+| Specification
+| Draft
+|-
 | [[bip-0442.md|442]]
 | Consensus (soft fork)
 | OP_PAIRCOMMIT
diff --git a/bip-unknown-varops-budget.mediawiki b/bip-0440.mediawiki
similarity index 99%
rename from bip-unknown-varops-budget.mediawiki
rename to bip-0440.mediawiki
index bfc927b3a4..409b5a0f47 100644
--- a/bip-unknown-varops-budget.mediawiki
+++ b/bip-0440.mediawiki
@@ -1,12 +1,12 @@
-  BIP: ?
+  BIP: 440
   Layer: Consensus (soft fork)
   Title: Varops Budget For Script Runtime Constraint
   Authors: Rusty Russell 
            Julian Moik 
   Status: Draft
   Type: Specification
-  Assigned: ?
+  Assigned: 2026-03-25
   License: BSD-3-Clause
   Discussion: https://groups.google.com/g/bitcoindev/c/GisTcPb8Jco/m/8znWcWwKAQAJ
               https://delvingbitcoin.org/t/benchmarking-bitcoin-script-evaluation-for-the-varops-budget-great-script-restoration/2094
diff --git a/bip-unknown-script-restoration.mediawiki b/bip-0441.mediawiki
similarity index 98%
rename from bip-unknown-script-restoration.mediawiki
rename to bip-0441.mediawiki
index bb3b0dfc0e..f658bcefdd 100644
--- a/bip-unknown-script-restoration.mediawiki
+++ b/bip-0441.mediawiki
@@ -1,16 +1,16 @@
 
-  BIP: ?
+  BIP: 441
   Layer: Consensus (soft fork)
   Title: Restoration of disabled script (Tapleaf 0xC2)
   Authors: Rusty Russell 
            Julian Moik 
   Status: Draft
   Type: Specification
-  Assigned: ?
+  Assigned: 2026-03-25
   License: BSD-3-Clause
   Discussion: https://groups.google.com/g/bitcoindev/c/GisTcPb8Jco/m/8znWcWwKAQAJ
   Version: 0.2.1
-  Requires: Varops BIP
+  Requires: 440
 
 ==Introduction==
@@ -19,7 +19,7 @@

 This BIP introduces a new tapleaf version (0xc2) which restores Bitcoin script
 to its pre-0.3.1 capability, relying on the Varops Budget in
-[[bip-unknown-varops-budget.mediawiki|BIP-varops]] to prevent the excessive
+[[bip-0440.mediawiki|BIP440]] to prevent the excessive
 computational time which caused CVE-2010-5137.

 In particular, this BIP:
@@ -184,7 +184,7 @@ fail validation.
 These are popped off the stack in right-to-left order, i.e. [A B] means pop B
 off the stack, then pop A off the stack.
-See [[bip-unknown-varops-budget.mediawiki|BIP-varops]] for the meaning of the
+See [[bip-0440.mediawiki|BIP440]] for the meaning of the
 annotations in the varops cost field.

 ====Splice Opcodes====
@@ -577,7 +577,7 @@ For simplicity, this is charged whether the OP_CHECKSIGADD succeeds or not.
 ===Other Operators===

 The varops costs of the following opcodes are defined in
-[[bip-unknown-varops-budget.mediawiki|BIP-varops]]:
+[[bip-0440.mediawiki|BIP440]]:

 * OP_VERIFY
 * OP_NOT