restore: http head resolve (full and incr) #9050

Open
jvarela-jump wants to merge 1 commit into main from jvarela/snapshot-incr-resolve

Contributor

@jvarela-jump jvarela-jump commented Mar 26, 2026

There are two parts in this PR:

  1. Geographically distant peers tend to observe stale incremental snapshot information via gossip (regarding other HTTP peers): by the time an incremental snapshot for the given full_slot and advertised incr_slot is requested, that snapshot may no longer be available, and the snapshot load pipeline blacklists the peer. To make the HTTP GET more robust against stale gossip, this PR introduces fd_sshead. The snapct tile performs an HTTP HEAD against the best peer, validates the incr_slot that is actually being served, and (if valid) adjusts the incremental snapshot name before proceeding with the download request.
    As an example of the observed behavior:
WARNING .../fd_snapct_tile.c(803)[after_credit]: incremental pre-resolve: peer ...:... serves incr_slot=409016727 but gossip advertised incr_slot=409016427

Closes #8956.

  2. Also includes restore: full snapshot resolve #9088 as a second commit, extending the HTTP HEAD resolve to the full snapshot (to filter gossip peers that advertise an RPC address but do not respond to HTTP requests).

Closes #9091.
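
As a rough illustration of the validation step in part 1 (not the fd_sshead or fd_ssresolve code itself), the sketch below parses the file name a peer reports and decides whether incr_slot needs adjusting. It assumes the usual Solana naming `incremental-snapshot-<full_slot>-<incr_slot>-<hash>.tar.zst`; the function names, the validity rule (matching full_slot, incr_slot greater than full_slot), and the return codes are all hypothetical.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a run of decimal digits at *p into *out
   and advance *p past it.  Returns 0 on success, -1 otherwise. */
int
parse_slot( char const ** p, unsigned long * out ) {
  char const * s = *p;
  if( *s<'0' || *s>'9' ) return -1;       /* reject sign / whitespace */
  char * end;
  errno = 0;
  unsigned long v = strtoul( s, &end, 10 );
  if( errno || end==s ) return -1;        /* overflow or no digits */
  *p   = end;
  *out = v;
  return 0;
}

/* Hypothetical helper: extract both slots from a name of the assumed
   form incremental-snapshot-<full>-<incr>-<hash>.tar.zst. */
int
parse_incr_name( char const * name, unsigned long * full_slot, unsigned long * incr_slot ) {
  static char const prefix[] = "incremental-snapshot-";
  static char const suffix[] = ".tar.zst";
  size_t name_len   = strlen( name );
  size_t prefix_len = sizeof(prefix)-1UL;
  size_t suffix_len = sizeof(suffix)-1UL;
  if( name_len<prefix_len+suffix_len ) return -1;
  if( strncmp( name, prefix, prefix_len ) ) return -1;
  if( strcmp( name+name_len-suffix_len, suffix ) ) return -1;  /* zstd only */
  char const * p = name+prefix_len;
  if( parse_slot( &p, full_slot ) || *p!='-' ) return -1;
  p++;
  if( parse_slot( &p, incr_slot ) || *p!='-' ) return -1;
  return 0;
}

/* On entry *incr_slot holds the gossip-advertised slot.  Returns 1 if
   it was replaced by the served slot, 0 if gossip already matched, and
   -1 if the served name is unusable (assumed validity rule). */
int
resolve_incr_slot( char const * served_name, unsigned long expected_full_slot, unsigned long * incr_slot ) {
  unsigned long full, incr;
  if( parse_incr_name( served_name, &full, &incr ) ) return -1;
  if( full!=expected_full_slot || incr<=full ) return -1;
  if( incr==*incr_slot ) return 0;
  *incr_slot = incr;
  return 1;
}
```

With the slots from the warning above (gossip advertised 409016427, peer serves 409016727) and a placeholder full slot and hash, `resolve_incr_slot` would return 1 and overwrite the caller's slot with 409016727, so the subsequent GET asks for a file the peer actually has.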

Copilot AI review requested due to automatic review settings March 26, 2026 16:01
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.122317 s | 0.121956 s | -0.295% |
| backtest mainnet-406545575-perf snapshot load | 3.277 s | 2.835 s | -13.488% |
| backtest mainnet-406545575-perf total elapsed | 122.317124 s | 121.956199 s | -0.295% |
| firedancer mem usage with mainnet.toml | 1096.43 GiB | 1096.43 GiB | 0.000% |
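
For reading these tables: the Change column appears to be the relative difference between New and Baseline (an inference from the figures shown, not something the bot states). A worked check against the snapshot load row:

$$
\text{Change} = \frac{\text{New} - \text{Baseline}}{\text{Baseline}} \times 100\% = \frac{2.835 - 3.277}{3.277} \times 100\% \approx -13.488\%
$$

Under that reading, a negative value means the new build was faster.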

Contributor

Copilot AI left a comment

Pull request overview

This PR improves robustness of incremental snapshot downloading in snapct by validating a peer’s actually-served incremental snapshot (via an HTTP HEAD) before attempting the GET, avoiding failures caused by stale gossip advertisements.

Changes:

  • Add fd_sshead, a non-blocking lifecycle wrapper around fd_ssresolve for plain-HTTP HEAD pre-resolves, plus a comprehensive unit test.
  • Update fd_ssresolve request rendering and response handling (Host header includes port; improved redirect/zstd validation; better handling of full response buffers and disconnects).
  • Integrate incremental snapshot HEAD pre-resolve into fd_snapct_tile to confirm/adjust incr_slot (skip for configured HTTPS peers where periodic HEAD resolves already occur).
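
To make the "Host header includes port" change concrete, here is a minimal sketch of rendering a plain-HTTP HEAD request. It is not the fd_ssresolve implementation: the function name, the header set, and the request path used in the usage note are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: render a HEAD request whose Host header carries
   both host and port.  Returns the number of bytes written (excluding
   the terminating NUL), or -1 if buf is too small. */
int
render_head_request( char * buf, size_t buf_sz, char const * path, char const * host, unsigned short port ) {
  int n = snprintf( buf, buf_sz,
                    "HEAD %s HTTP/1.1\r\n"
                    "Host: %s:%u\r\n"
                    "Connection: close\r\n"
                    "\r\n",
                    path, host, (unsigned)port );
  if( n<0 || (size_t)n>=buf_sz ) return -1;  /* encoding error or truncated */
  return n;
}
```

For example, `render_head_request( buf, sizeof(buf), "/incremental-snapshot.tar.bz2", "192.0.2.1", 8899 )` yields a request with `Host: 192.0.2.1:8899`. The path here follows the common Solana RPC convention of redirecting a generic snapshot URL to the concrete file name, which is assumed rather than taken from this PR. Checking the `snprintf` return value against the buffer size keeps a truncated request from ever being sent.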

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| src/discof/restore/utils/test_sshead.c | New unit test coverage for fd_sshead behaviors (success, redirects, malformed responses, timeout, disconnect, buffer-full, etc.). |
| src/discof/restore/utils/fd_ssresolve.c | Improves HTTP request Host header, redirect parsing diagnostics, and response read edge-case handling. |
| src/discof/restore/utils/fd_sshttp_private.h | Increases internal sshttp deadline constant to 2 seconds. |
| src/discof/restore/utils/fd_sshead.h | New public API for fd_sshead (start/advance/cancel/active + shmem lifecycle). |
| src/discof/restore/utils/fd_sshead.c | Implements non-blocking HEAD pre-resolve wrapper around fd_ssresolve using poll/connect/timeout/cleanup. |
| src/discof/restore/fd_snapct_tile.c | Adds incremental pre-resolve state handling to verify served incr_slot before download and avoid blacklisting due to stale gossip. |
| src/discof/restore/Local.mk | Wires fd_sshead into the build and adds/runs the new test_sshead unit test. |
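
The poll/connect/timeout pattern mentioned for fd_sshead.c can be sketched roughly as follows. This is a generic POSIX illustration, not the fd_sshead API: `head_start`, `head_advance`, the `HEAD_*` codes, and the caller-supplied clock values are hypothetical names chosen for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define HEAD_AGAIN  (0)  /* connect still in progress, call again later */
#define HEAD_READY  (1)  /* connected, the request can be sent */
#define HEAD_FAIL  (-1)  /* connect failed or the deadline passed */

/* Begin a non-blocking TCP connect to ip4 (network byte order) and
   port (host byte order).  Returns the socket, or -1 on failure. */
int
head_start( unsigned int ip4, unsigned short port ) {
  int fd = socket( AF_INET, SOCK_STREAM, 0 );
  if( fd<0 ) return -1;
  int flags = fcntl( fd, F_GETFL, 0 );
  if( flags<0 || fcntl( fd, F_SETFL, flags | O_NONBLOCK )<0 ) {
    close( fd );
    return -1;
  }
  struct sockaddr_in addr = {0};
  addr.sin_family      = AF_INET;
  addr.sin_port        = htons( port );
  addr.sin_addr.s_addr = ip4;
  if( connect( fd, (struct sockaddr *)&addr, sizeof(addr) )<0 && errno!=EINPROGRESS ) {
    close( fd );
    return -1;
  }
  return fd;
}

/* Poll once without blocking.  now and deadline come from any
   monotonic clock the caller chooses; on HEAD_FAIL the caller is
   expected to close fd (the cleanup step). */
int
head_advance( int fd, long now, long deadline ) {
  if( now>deadline ) return HEAD_FAIL;
  struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
  int n = poll( &pfd, 1, 0 );
  if( n<0 ) return errno==EINTR ? HEAD_AGAIN : HEAD_FAIL;
  if( n==0 ) return HEAD_AGAIN;
  int       err     = 0;
  socklen_t err_len = sizeof(err);
  if( getsockopt( fd, SOL_SOCKET, SO_ERROR, &err, &err_len )<0 || err ) return HEAD_FAIL;
  return HEAD_READY;
}
```

Because `head_advance` never blocks, a tile-style loop can call it once per iteration and keep servicing other work; a slow or unresponsive peer then costs at most the deadline rather than stalling the loop.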

Comment thread src/discof/restore/utils/test_sshead.c Outdated
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from 2909dde to e2ee7f1 Compare March 26, 2026 16:21
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.143482 s | 0.143844 s | 0.252% |
| backtest mainnet-406545575-perf snapshot load | 5.178 s | 3.571 s | -31.035% |
| backtest mainnet-406545575-perf total elapsed | 143.481546 s | 143.844008 s | 0.253% |
| firedancer mem usage with mainnet.toml | 1096.43 GiB | 1096.43 GiB | 0.000% |

@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from e2ee7f1 to 7c61ac4 Compare March 30, 2026 21:34
Copilot AI review requested due to automatic review settings March 30, 2026 21:34
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from 7c61ac4 to 87488ef Compare March 30, 2026 21:36
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comment thread src/discof/restore/fd_snapct_tile.c Outdated
Comment thread src/discof/restore/utils/fd_sshead.c Outdated
Comment thread src/discof/restore/fd_snapct_tile.c Outdated
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.122188 s | 0.122235 s | 0.038% |
| backtest mainnet-406545575-perf snapshot load | 3.375 s | 2.697 s | -20.089% |
| backtest mainnet-406545575-perf total elapsed | 122.188402 s | 122.23527 s | 0.038% |
| firedancer mem usage with mainnet.toml | 1090.43 GiB | 1090.43 GiB | 0.000% |

@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from 87488ef to aad8719 Compare March 30, 2026 22:03
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.122278 s | 0.122027 s | -0.205% |
| backtest mainnet-406545575-perf snapshot load | 3.258 s | 2.737 s | -15.991% |
| backtest mainnet-406545575-perf total elapsed | 122.278066 s | 122.026989 s | -0.205% |
| firedancer mem usage with mainnet.toml | 1090.43 GiB | 1090.43 GiB | 0.000% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread src/discof/restore/fd_snapct_tile.c Outdated
Comment thread src/discof/restore/fd_snapct_tile.c Outdated
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from d288db7 to e52959c Compare April 6, 2026 13:34
@jvarela-jump jvarela-jump changed the title from "restore: incremental resolve" to "restore: http head resolve (full and incr)" Apr 6, 2026
@github-actions

github-actions Bot commented Apr 6, 2026

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.142491 s | 0.142893 s | 0.282% |
| backtest mainnet-406545575-perf snapshot load | 5.207 s | 3.596 s | -30.939% |
| backtest mainnet-406545575-perf total elapsed | 142.491197 s | 142.892606 s | 0.282% |
| firedancer mem usage with mainnet.toml | 1112.43 GiB | 1112.43 GiB | 0.000% |

Copilot AI review requested due to automatic review settings April 6, 2026 14:15
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from e52959c to 0f52524 Compare April 6, 2026 14:15
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread src/discof/restore/fd_snapct_tile.c Outdated
@github-actions

github-actions Bot commented Apr 6, 2026

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.136303 s | 0.137036 s | 0.538% |
| backtest mainnet-406545575-perf snapshot load | 3.836 s | 4.102 s | 6.934% ⚠️ |
| backtest mainnet-406545575-perf total elapsed | 136.303235 s | 137.036035 s | 0.538% |
| firedancer mem usage with mainnet.toml | 1112.43 GiB | 1112.43 GiB | 0.000% |

@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from ca3104a to 7f39e18 Compare April 14, 2026 21:15
@github-actions

Performance Measurements ⏳

Suite Baseline New Change
backtest mainnet-406545575-perf per slot 0.13271 s 0.132867 s 0.118%
backtest mainnet-406545575-perf snapshot load 5.124 s 3.519 s -31.323%
backtest mainnet-406545575-perf total elapsed 132.710328 s 132.86676 s 0.118%
firedancer mem usage with mainnet.toml 1152.49 GiB 1152.49 GiB 0.000%

Copilot AI review requested due to automatic review settings April 17, 2026 17:08
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from 7f39e18 to 21ab4ae Compare April 17, 2026 17:08
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.105092 s | 0.105261 s | 0.161% |
| backtest mainnet-406545575-perf snapshot load | 3.22 s | 3.279 s | 1.832% |
| backtest mainnet-406545575-perf total elapsed | 105.091503 s | 105.260978 s | 0.161% |
| firedancer mem usage with mainnet.toml | 1153.49 GiB | 1153.49 GiB | 0.000% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread src/discof/restore/fd_snapct_tile.c
Comment thread src/discof/restore/fd_snapct_tile.c
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from 21ab4ae to d005cdd Compare April 18, 2026 22:24
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.104989 s | 0.105004 s | 0.014% |
| backtest mainnet-406545575-perf snapshot load | 3.191 s | 3.22 s | 0.909% |
| backtest mainnet-406545575-perf total elapsed | 104.988989 s | 105.004326 s | 0.015% |
| firedancer mem usage with mainnet.toml | 1153.49 GiB | 1153.49 GiB | 0.000% |

Copilot AI review requested due to automatic review settings April 20, 2026 14:11
@jvarela-jump jvarela-jump force-pushed the jvarela/snapshot-incr-resolve branch from d005cdd to ea91c78 Compare April 20, 2026 14:11
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.104695 s | 0.104962 s | 0.255% |
| backtest mainnet-406545575-perf snapshot load | 3.237 s | 3.409 s | 5.314% ⚠️ |
| backtest mainnet-406545575-perf total elapsed | 104.694838 s | 104.961522 s | 0.255% |
| firedancer mem usage with mainnet.toml | 1153.49 GiB | 1153.49 GiB | 0.000% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread src/discof/restore/utils/fd_sshead.c Outdated
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.104874 s | 0.10486 s | -0.013% |
| backtest mainnet-406545575-perf snapshot load | 3.24 s | 3.285 s | 1.389% |
| backtest mainnet-406545575-perf total elapsed | 104.873954 s | 104.860466 s | -0.013% |
| firedancer mem usage with mainnet.toml | 1153.49 GiB | 1153.49 GiB | 0.000% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread src/discof/restore/utils/fd_ssresolve.c
Comment thread src/discof/restore/utils/fd_ssresolve.c
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.113039 s | 0.112859 s | -0.159% |
| backtest mainnet-406545575-perf snapshot load | 3.502 s | 3.597 s | 2.713% |
| backtest mainnet-406545575-perf total elapsed | 113.038694 s | 112.859374 s | -0.159% |
| firedancer mem usage with mainnet.toml | 1153.49 GiB | 1153.49 GiB | 0.000% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread src/discof/restore/fd_snapct_tile.c Outdated
Comment thread src/discof/restore/fd_snapct_tile.c Outdated
@github-actions

Performance Measurements ⏳

| Suite | Baseline | New | Change |
| --- | --- | --- | --- |
| backtest mainnet-406545575-perf per slot | 0.132621 s | 0.132854 s | 0.176% |
| backtest mainnet-406545575-perf snapshot load | 5.239 s | 5.421 s | 3.474% |
| backtest mainnet-406545575-perf total elapsed | 132.621315 s | 132.854312 s | 0.176% |
| firedancer mem usage with mainnet.toml | 1153.49 GiB | 1153.49 GiB | 0.000% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

Development

Successfully merging this pull request may close these issues.

  • restore: filter gossip peers that do not serve snapshot requests
  • restore: incremental snapshot slot verification before download request

2 participants