Skip to content

PostgreSQLSyncProvider.GetChanges: rollback masks exceptions; GetLastRemoteAnchor crashes on -1 sentinel #13

@davidortinau

Description

@davidortinau

Summary

Two bugs in PostgreSQLSyncProvider.GetChangesAsync cause /api/sync-agent/changes-bulk/{otherStoreId} to return HTTP 500 the first time any peer requests changes from a fresh PostgreSQL store. Symptoms are masked by a secondary exception in the catch block, so the underlying issue is hard to spot.

Reproduced against CoreSync.PostgreSQL 0.1.127 and 0.1.129 with .NET 10, Npgsql 9.x, and a SQLite peer using CoreSync.Http.Client over HTTPS. End-to-end fix verified locally — happy to send a PR if useful.

Bug 1 — tr.Rollback() on already-completed transaction masks the real exception

In GetChangesAsync (around line 396–409 of src/CoreSync.PostgreSQL/PostgreSQLSyncProvider.cs):

await tr.CommitAsync();
// ...
var anchor = await GetLastRemoteAnchorForStoreAsync(...); // <-- if this throws, we go to catch
return new SyncChanges(anchor, items);
}
catch (Exception)
{
    tr.Rollback();   // throws "This NpgsqlTransaction has completed; it is no longer usable"
    throw;
}

After CommitAsync, the transaction is completed. If anything between commit and the return throws, the catch block calls Rollback() which itself throws InvalidOperationException, and that replaces the original exception. The user only sees "This NpgsqlTransaction has completed" (or, if surfaced through ASP.NET ProblemDetails, just a generic 500 with no detail).

Suggested fix: wrap the rollback in its own try/catch and log the original exception:

catch (Exception ex)
{
    _logger?.LogError(ex, "Error while assembling SyncChanges for store {OtherStoreId}", otherStoreId);
    try { tr.Rollback(); } catch { /* tx may already be committed/aborted */ }
    throw;
}

Bug 2 — GetLastRemoteAnchorForStoreAsync hands a -1 sentinel to SyncAnchor which validates version >= 0

__core_sync_remote_anchor stores remote_version = -1 (and local_version = NULL) as a placeholder for "this peer has never received anything from us yet". Same idea for __core_sync_local_anchor.

GetLastRemoteAnchorForStoreAsync (around lines 432–447) reads that row and unconditionally constructs:

return new SyncAnchor(otherStoreId, remoteVersion);

SyncAnchor's constructor validates that version is non-negative and throws:

System.ArgumentException: Invalid version, must be not negative (Parameter 'version')
   at CoreSync.SyncAnchor..ctor(...)
   at CoreSync.PostgreSQL.PostgreSQLSyncProvider.GetLastRemoteAnchorForStoreAsync(...)

This means the very first GetChangesAsync call for any new peer fails, until that peer has successfully received at least one batch (which it can't, because this method is part of that flow). GetLastLocalAnchorForStoreAsync has the same shape.

Suggested fix: treat negative versions as SyncAnchor.Null:

if (remoteVersion < 0) return SyncAnchor.Null;
return new SyncAnchor(otherStoreId, remoteVersion);

(and similarly in GetLastLocalAnchorForStoreAsync).

Reproduction

  1. Spin up a PostgreSQL-backed CoreSync server (configured exactly per the README).
  2. Provision a fresh peer (any client) that has never synced — its row in __core_sync_remote_anchor will have remote_version = -1.
  3. From the peer, call GET /api/sync-agent/changes-bulk/{otherStoreId} (or invoke the equivalent provider method server-side).
  4. Server returns HTTP 500. Application logs only show the masked rollback exception. The real cause is ArgumentException: Invalid version, must be not negative from SyncAnchor.

After applying both fixes, the same call returns 200 with the full change set. We have it serving 17,994 changes across 360 pages to a Flutter peer with no further changes.

Local workaround

We're consuming a locally-rebuilt CoreSync.PostgreSQL.0.1.129-local2.nupkg with both fixes pinned via package source mapping in our app repo until this lands upstream.

Versions:

  • CoreSync.PostgreSQL 0.1.129 (also reproduces on 0.1.127)
  • CoreSync.Http.Client 0.1.129 client / CoreSync.Http.Server 0.1.129 server
  • .NET 10.0.101, Npgsql 9.0.x

Happy to send a PR with the patches and a regression test. Thanks for maintaining CoreSync — it's been great to work with.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions