Skip to content

FIRST(balance) returns wrong value depending on which other aggregators are in the SELECT #279

@robcohen

Description

@robcohen

FIRST(balance) returns a different value depending on whether another aggregator that always evaluates its operand (e.g. LAST(balance)) is also in the SELECT list. The Assets:Checking running balance after all postings is the same, but FIRST(balance) for the last group differs.

Reproduction

cat > /tmp/example.beancount <<'BEAN'
plugin "beancount.plugins.auto_accounts"

2017-10-15 * "Repeated transaction" ^link.1
  Expenses:Car   21.84 USD
  Assets:Checking  -21.84 USD

2017-10-15 * "Repeated transaction" ^link.1
  Expenses:Car   21.84 USD
  Assets:Checking  -21.84 USD

2018-07-23 * "Later transaction"
  Expenses:Misc   10.00 USD
  Assets:Checking  -10.00 USD
BEAN

bean-query /tmp/example.beancount \
  "SELECT year, month, FIRST(balance) WHERE account ~ '^Assets' GROUP BY year, month"

bean-query /tmp/example.beancount \
  "SELECT year, month, FIRST(balance), LAST(balance) WHERE account ~ '^Assets' GROUP BY year, month"

First query:

year  mo  FIRST(bala
----  --  ----------
2017  10  -21.84 USD
2018   7  -31.84 USD

Second query (same data, FIRST result changes):

year  mo  FIRST(bala  LAST(balan
----  --  ----------  ----------
2017  10  -21.84 USD  -43.68 USD
2018   7  -53.68 USD  -53.68 USD

The cumulative running balance after all three postings is -21.84 + -21.84 + -10.00 = -53.68. The first query reports -31.84 for July 2018, missing the second 2017-10 posting. The second query reports the correct -53.68.

Cause

balance in beanquery/sources/beancount.py mutates a shared accumulator on every call:

@cache(maxsize=1)
def balance(context):
    context.balance.add_position(context.posting)
    return copy.copy(context.balance)

_PostingsTableRow.__iter__ yields the same context object for every row, so context.balance is shared state across rows.

FIRST in beanquery/query_env.py only evaluates its operand on the first row of each group:

def update(self, store, context):
    if store[self.handle] is None:
        value = self.operands[0](context)
        store[self.handle] = value

When FIRST(balance) is the only consumer of balance, the second row of the 2017-10 group never gets balance(context) called, so its posting is not added to the accumulator. The 2018-07 group then reads a stale accumulator.

LAST evaluates its operand on every row, so adding LAST(balance) to the SELECT keeps the accumulator current. The @cache(maxsize=1) decorator means FIRST then reads the value LAST already computed for the same row.

Versions

beanquery 0.2.0, beancount 3.x, Python 3.12.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions