Proposal
The agent currently uses the same Checkpoint implementation as all other parts of Prometheus:

    // Checkpoint creates a compacted checkpoint of segments in range [from, to] in the given WAL.
    // It includes the most recent checkpoint if it exists.
    // All series not satisfying keep, samples/tombstones/exemplars below mint and
    // metadata that are not the latest are dropped.

The checkpoint serves three purposes in agent mode:
- Populating the agent db stripeSeries with known series + last sample timestamps on startup
- Populating the series caches in queue_manager on startup
- Pruning the series caches in queue_manager after a new checkpoint is created
- Providing the most recent metadata for a series (not applicable for agent mode yet, and might be dropped)
This is a very small subset of what a checkpoint persists: the series that exist in the WAL, samples above mint, float and regular histogram samples above mint, exemplars above mint, and the latest metadata. To create a checkpoint with all of these records, we re-read the current checkpoint plus all segments. That is a lot of overhead, given that all the data the agent requires from the checkpoint is already in memory, between stripeSeries and the deleted series in the agent db.
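To illustrate how little of the checkpoint the agent consumes on startup, here is a minimal sketch of such a replay. It is not the agent's actual code: `SeriesRef`, `seriesRecord`, `sampleRecord`, `seriesState` and `replay` are hypothetical stand-ins for the real Prometheus types, and labels are reduced to a plain string so the example has no dependencies.

```go
package main

import (
	"fmt"
	"math"
)

// SeriesRef is a hypothetical stand-in for chunks.HeadSeriesRef.
type SeriesRef uint64

// seriesRecord and sampleRecord are simplified stand-ins for the series and
// sample records found in a checkpoint.
type seriesRecord struct {
	Ref    SeriesRef
	Labels string
}

type sampleRecord struct {
	Ref SeriesRef
	T   int64
}

// seriesState is what this sketch assumes the agent keeps per series:
// the labels and the timestamp of the last sample seen.
type seriesState struct {
	Labels string
	LastTs int64
}

// replay rebuilds the in-memory state from checkpoint records. Sample values
// are not needed at all; only the newest timestamp per series is kept.
func replay(series []seriesRecord, samples []sampleRecord) map[SeriesRef]*seriesState {
	state := make(map[SeriesRef]*seriesState, len(series))
	for _, r := range series {
		// math.MinInt64 marks "no sample seen yet" in this sketch.
		state[r.Ref] = &seriesState{Labels: r.Labels, LastTs: math.MinInt64}
	}
	for _, s := range samples {
		st, ok := state[s.Ref]
		if !ok {
			continue // sample for an unknown series; skipped in this sketch
		}
		if s.T > st.LastTs {
			st.LastTs = s.T
		}
	}
	return state
}

func main() {
	state := replay(
		[]seriesRecord{{Ref: 1, Labels: `{__name__="up"}`}},
		[]sampleRecord{{Ref: 1, T: 1000}, {Ref: 1, T: 1005}},
	)
	fmt.Println(len(state), state[1].LastTs)
}
```

Every sample record beyond the newest one per series is read and then discarded, which is the overhead described above.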
I propose we introduce another checkpoint implementation which could look something like,

    type ActiveSeries interface {
        Ref() chunks.HeadSeriesRef
        Labels() labels.Labels
        LastSampleTimestamp() int64
    }

    // Checkpoint creates an unindexed checkpoint containing record.RefSeries and
    // record.RefSample for ActiveSeries and a record.RefSeries for the recentlyDeleted series.
    func Checkpoint(logger *slog.Logger, w *WL, seriesIter iter.Seq[ActiveSeries], recentlyDeleted []chunks.HeadSeriesRef)

that could be driven by the data we currently have in memory which would,
- Reduce the overhead of taking a checkpoint
- Reduce the overhead of queue_manager reading a checkpoint as checkpoints will be smaller
- Improve startup times/resource usage due to smaller checkpoint sizes
I did a quick implementation of this in Grafana Alloy, where it shrank a 214MB checkpoint by 36% down to 137MB, with the following improvements to creating a checkpoint + loading a checkpoint:

                  │ old-create.txt │           new-create.txt            │
                  │     sec/op     │   sec/op     vs base                │
    Checkpoint-11     3477.6m ± 7%   913.3m ± 6%  -73.74% (p=0.002 n=6)

                  │ old-create.txt │            new-create.txt             │
                  │      B/op      │     B/op       vs base                │
    Checkpoint-11   2717.25Mi ± 0%   11.52Mi ± 11%  -99.58% (p=0.002 n=6)

                  │ old-create.txt │           new-create.txt            │
                  │   allocs/op    │ allocs/op   vs base                 │
    Checkpoint-11   34087723.5 ± 0%   325.0 ± 1%  -100.00% (p=0.002 n=6)

                    │ baseline-load.txt │            new-load.txt             │
                    │      sec/op       │   sec/op     vs base                │
    LoadLargeWAL-11          4.195 ± 2%   1.105 ± 5%  -73.67% (p=0.002 n=6)

                    │ baseline-load.txt │             new-load.txt             │
                    │       B/op        │     B/op      vs base                │
    LoadLargeWAL-11        2.001Gi ± 1%   1.204Gi ± 0%  -39.83% (p=0.002 n=6)

                    │ baseline-load.txt │            new-load.txt             │
                    │     allocs/op     │  allocs/op   vs base                │
    LoadLargeWAL-11         35.22M ± 0%   30.76M ± 0%  -12.66% (p=0.002 n=6)

Source of the Checkpoint comment quoted above: prometheus/tsdb/wlog/checkpoint.go, lines 87 to 90 at commit 61aa828.