FIRST(balance) returns a different value depending on whether another aggregator that always evaluates its operand (e.g. LAST(balance)) is also in the SELECT list. The Assets:Checking running balance after all postings is the same, but FIRST(balance) for the last group differs.
Reproduction
cat > /tmp/example.beancount <<'BEAN'
plugin "beancount.plugins.auto_accounts"
2017-10-15 * "Repeated transaction" ^link.1
Expenses:Car 21.84 USD
Assets:Checking -21.84 USD
2017-10-15 * "Repeated transaction" ^link.1
Expenses:Car 21.84 USD
Assets:Checking -21.84 USD
2018-07-23 * "Later transaction"
Expenses:Misc 10.00 USD
Assets:Checking -10.00 USD
BEAN
bean-query /tmp/example.beancount \
"SELECT year, month, FIRST(balance) WHERE account ~ '^Assets' GROUP BY year, month"
bean-query /tmp/example.beancount \
"SELECT year, month, FIRST(balance), LAST(balance) WHERE account ~ '^Assets' GROUP BY year, month"
First query:
year mo FIRST(bala
---- -- ----------
2017 10 -21.84 USD
2018 7 -31.84 USD
Second query (same data, FIRST result changes):
year mo FIRST(bala LAST(balan
---- -- ---------- ----------
2017 10 -21.84 USD -43.68 USD
2018 7 -53.68 USD -53.68 USD
The cumulative running balance after all three postings is -21.84 + -21.84 + -10.00 = -53.68. The first query reports -31.84 for July 2018, missing the second 2017-10 posting. The second query reports the correct -53.68.
Cause
balance in beanquery/sources/beancount.py mutates a shared accumulator on every call:
@cache(maxsize=1)
def balance(context):
context.balance.add_position(context.posting)
return copy.copy(context.balance)
_PostingsTableRow.__iter__ yields the same context object for every row, so context.balance is shared state across rows.
FIRST in beanquery/query_env.py only evaluates its operand on the first row of each group:
def update(self, store, context):
if store[self.handle] is None:
value = self.operands[0](context)
store[self.handle] = value
When FIRST(balance) is the only consumer of balance, the second row of the 2017-10 group never gets balance(context) called, so its posting is not added to the accumulator. The 2018-07 group then reads a stale accumulator.
LAST evaluates its operand on every row, so adding LAST(balance) to the SELECT keeps the accumulator current. The @cache(maxsize=1) decorator means FIRST then reads the value LAST already computed for the same row.
Versions
beanquery 0.2.0, beancount 3.x, Python 3.12.
FIRST(balance)returns a different value depending on whether another aggregator that always evaluates its operand (e.g.LAST(balance)) is also in the SELECT list. The Assets:Checking running balance after all postings is the same, butFIRST(balance)for the last group differs.Reproduction
First query:
Second query (same data, FIRST result changes):
The cumulative running balance after all three postings is
-21.84 + -21.84 + -10.00 = -53.68. The first query reports-31.84for July 2018, missing the second 2017-10 posting. The second query reports the correct-53.68.Cause
balanceinbeanquery/sources/beancount.pymutates a shared accumulator on every call:_PostingsTableRow.__iter__yields the samecontextobject for every row, socontext.balanceis shared state across rows.FIRSTinbeanquery/query_env.pyonly evaluates its operand on the first row of each group:When
FIRST(balance)is the only consumer ofbalance, the second row of the 2017-10 group never getsbalance(context)called, so its posting is not added to the accumulator. The 2018-07 group then reads a stale accumulator.LASTevaluates its operand on every row, so addingLAST(balance)to the SELECT keeps the accumulator current. The@cache(maxsize=1)decorator means FIRST then reads the value LAST already computed for the same row.Versions
beanquery 0.2.0, beancount 3.x, Python 3.12.