Len fort tracker fixes#359
Conversation
- Seed cell.lastSeen in RegisterFort so the firstScan branch does not erase preloaded forts when the first GMO has partial coverage. - Merge (rather than replace) cell fort sets on firstScan and refresh lastSeen for retained forts. - Reject GMOs whose timestamp is implausibly far in the future (>60s) so a bad payload cannot permanently wedge a cell. - Dedup fort ids that appear in both pokestopIds and gymIds (gym wins). - Defensively insert forts into the new cell set during cell-move handling. - Remove tracker entry when clearGymWithLock/clearPokestopWithLock cannot load the entity, instead of leaving a dangling reference. - Drop unused UpdateFort and inline RestoreFort with explicit ms units; RestoreFort callers now pass time.Now().UnixMilli().
… logs - Collapse loadPokestopsFromDB / loadGymsFromDB into a single loadFortKindFromDB helper parameterised by table + isGym flag. - Replace clearGymWithLock / clearPokestopWithLock with a generic clearFortWithLock plus per-kind ops tables (gymClearOps, pokestopClearOps). - Hoist debug "pending removal" logs out of the global tracker lock by splitting ProcessCellUpdate into a thin wrapper and a locked core that returns the pending lists. - Downgrade per-conversion log inside applyPresentForts to Debug, since clearFortWithLock already logs the conversion at Info level. - Iterate cellForts directly in CheckRemovedForts (mapCells slice was a parallel keyset) and drop the cellsToBeCleaned slice in decodeGMO. - Use strconv.FormatUint for cellId formatting in API response builders. - Fix doc comments / type-name mismatches on FortTrackerLastSeen and FortTrackerGMOContents.
…eload as confirmation - Add forts_min_miss_count config (default 1, preserves prior behavior). When set higher, a fort must be absent from N consecutive cell-scans past the time threshold before it is flagged stale, defending against transient single-frame coverage gaps and level-30 gating (§6.2). - Track missCount per fort: incremented on every miss in ProcessCellUpdate, reset to 0 on every sighting in applyPresentForts and on the firstScan merge path. - Treat preload (Preload + LoadFortsFromDB) as a confirmation event: use time.Now().UnixMilli() for the per-fort lastSeen instead of the DB Updated column. Otherwise a long downtime would set lastSeen far in the past, causing the first partial GMO post-restart to immediately stale every preloaded fort that wasn't in the GMO. - Tests for §6.6 scenarios: - partial first GMO after preload preserves preloaded forts; - future-dated GMO does not wedge the cell; - RestoreFort + partial GMO does not re-stale; - id appearing in both pokestopIds and gymIds dedups (gym wins); - missCount guard with N=2 needs two consecutive misses; - missCount resets on intervening sighting.
|
Might need some long-term testing, cc @Fabio1988 |
|
Some parts of the description don’t make any sense |
|
I will review and slap my tool later on :) |
|
Updated. Haven't tested myself yet. |
|
Started testing |
|
So far results are far better. @Fabio1988 mind joining in testing? |
|
Reviewed on my phone but I saw: cellid as string in data structure? Why? |
It's uint64 internally, string for JSON debug output. I pulled max value of Fixed passed ts. |
Fabio1988
left a comment
There was a problem hiding this comment.
tested, works well so far
Fort tracker: POI removal correctness, hardening, cleanup
Branch:
len-fort-tracker-fixes— 4 commits.Symptom
POIs that should be marked deleted were not removed until Golbat was restarted, and often not even then. A restart only "fixed" it if the first GMO post-restart contained full coverage
of the cell — partial first GMOs left the wrong fort set in tracking.
Root cause
Two interacting bugs in
ProcessCellUpdate'sfirstScanbranch and in preload bookkeeping:cell.lastSeenwas only ever written insideProcessCellUpdate. Preload (Preload/LoadFortsFromDB/RegisterFort) never set it. After every restart, every cell hadcell.lastSeen == 0, so the next GMO took thefirstScanpath.firstScanbranch overwrotecell.pokestops/cell.gymswith the contents of that single GMO. Any preloaded fort not in that one GMO was dropped from per-cell tracking.Because stale detection iterates
cell.pokestops/cell.gyms, those forts could never be flagged stale again — they were stuck as "live" entries inft.fortspointing at a cell thatno longer listed them.
A restart only helped when the next first-GMO happened to enumerate every fort in the cell.
Other bugs found and fixed
RestoreFortunits bug. Callers passedtime.Now().Unix()(seconds) into a milliseconds field.lastSeencame out ~1000× too small; the very next GMO saw a ~1e12 ms gap, blewpast the 1h stale threshold, and re-staled the fort immediately.
RestoreFortwall-clock vs GMO timestamp drift. After fixing the units bug, callers inupdatePokestopFromFort/updateGymFromFortstill usedtime.Now().UnixMilli()insteadof the GMO's own
fort.Timestamp. On replay or ingest lag the restored fort'slastSeenran ahead of (or behind) the cell'slastSeen, which is sourced from the GMO timestamp —leaving staleness math inconsistent across the cell and its restored member.
AsOfTimeMswedge. A clock-skewed or malicious scanner could permanently silence a cell — every later real GMO hadtimestamp <= cell.lastSeenand was dropped atthe early-return.
clear*WithLockghost entries. When the gym/pokestop could not be loaded from cache or DB, the tracker entry was left in place. Same fort was re-selected as stale on everysubsequent GMO.
UpdateFort. Only caller wasRestoreFort, and the units bug rode through it.pokestopIdsandgymIdsof one GMO, it was inserted into both sets and the conversion logic flipped back and forth.lastSeenfrompokestop.Updated. Once commit 1 madecell.lastSeennon-zero post-preload (so stale detection actually ran), this became live: any fort whose DBUpdatedpredated the stale threshold was flagged stale on the first GMO that didn't include it. Co-primary with the root cause; fixed together in commit 3.Fix summary
Four commits, in order:
1.
fix(fort_tracker)—d20d5caRegisterFortseedscell.lastSeen, so the first post-restart GMO is no longer treated as afirstScan.firstScanbranch merges (does not replace) cell fort sets, refreshinglastSeenfor retained forts.pokestopIdsandgymIds(gym wins, matches proto pack order).clear*WithLockremoves the tracker entry when the entity cannot be loaded.UpdateFort.RestoreFortdoes the work directly with explicit ms units; callers now passtime.Now().UnixMilli().2.
refactor(fort_tracker)—11f8a04loadPokestopsFromDB/loadGymsFromDBinto oneloadFortKindFromDBhelper.clearGymWithLock/clearPokestopWithLockwith a genericclearFortWithLockplus per-kind ops tables (gymClearOps,pokestopClearOps).ProcessCellUpdateinto a thin wrapper plus locked core that returns the pending lists.clearFortWithLock).cellFortsdirectly inCheckRemovedForts; drop the parallelcellsToBeCleanedslice indecodeGMO.strconv.FormatUintfor cellId formatting in API responses (S2 cell ids exceed JS 2^53 safe-int — observed DB max 8935267727656878080 — so they're stringified for JSON consumers).FortTrackerLastSeenandFortTrackerGMOContents.3.
feat(fort_tracker)—26140b2cleanup.forts_min_miss_count(default1, preserves prior behaviour). When set higher, a fort must be absent from N consecutive cell scans past the time threshold beforebeing flagged stale — defends against transient single-frame coverage gaps and level-30 gating.
missCountper fort, reset on every sighting (including thefirstScanmerge path).lastSeenistime.Now().UnixMilli()instead ofrow.Updated * 1000. Required for commit 1'scell.lastSeenseeding to be safe — otherwisepreloaded forts with old
Updatedvalues get instantly staled.RestoreFortfollowed by partial GMO does not re-stale;missCountguard with N=2 needs two consecutive misses;missCountresets on intervening sighting.4.
fix(fort_tracker)— RestoreFort uses GMO timestampupdatePokestopFromFortpassesnow*1000(param already received asfort.Timestamp/1000seconds) intoRestoreFortinstead oftime.Now().UnixMilli().updateGymFromFortgains atimestampMs int64parameter;gmo_decode.gocall site passesfort.Timestamp.lastSeenand its cell'slastSeenunder ingest lag or replay.Out of scope
ft.cells/ft.forts. GMOs sometimes return empty; we don't want to drop cells whose forts vanished transiently.LoadFortsFromDBand live ingest. Loading already blocks before listeners accept traffic.Testing
go test ./...— all green.missCountsemantics.Config