proposal: A new API for the discovery of metric names, labels and values#74
Conversation
10e6a1a to
2e650d8
Compare
There was a problem hiding this comment.
I think it's a really well written proposal! Please see my suggestions though.
Also wondering: Could /search/metric_names have a match[] parameter? Users might want to filter metric names by existing label matchers (e.g., "metric names where job=prometheus").
charleskorn
left a comment
There was a problem hiding this comment.
I've focused on the first endpoint below, but I believe my comments apply to the other two endpoints as well.
Thanks @aknuds1 for your feedback and suggestions. I have updated to include this suggestion. |
a4a5507 to
73ea349
Compare
|
Thank you @charleskorn for you feedback and suggestions! |
bwplotka
left a comment
There was a problem hiding this comment.
Leaving some questions.
Mostly around how feasible it is for OSS Prometheus, Thanos, Cortex and non-Grafana vendors to implement. I think we are close, but let's double check those things.
If there are pieces that are specific to Mimir, I'd suggest adding those parameters (extra ones) for Mimir only implementation, but leave the formal spec smaller for wider audience.
saswatamcode
left a comment
There was a problem hiding this comment.
Thanks for this! Some comments/questions,
I added a section around how we could support specific extensions for different implementations - so if there is anything here which is too Mimir specific we could move it into here. See 9f43fdd |
|
Thank you @yeya24 @bwplotka @saswatamcode @GiedriusS for all your commentary. Please keep calling out areas of concern or alternative suggestions! |
roidelapluie
left a comment
There was a problem hiding this comment.
I have read the issue and the comments, added a few.
I feel like it should be decided if we still want to support pagination or not, which is one of the main blockers I see in my view. Let's clarify this.
I also wonder if we should discuss in the proposal the API's that will need to be implemented. Do we want new functions added to the Querier ? What would that look like? Or do we need a new interface?
I have reached out and asked for this clarification - stay tuned!
@roidelapluie Would you like to update the proposal with your suggestions? |
I am familiar with Prometheus who is single tenant and that could accommodate with any interface, the question would be more for the people who run distributed systems (mimir, cortex etc). |
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Julien <291750+roidelapluie@users.noreply.github.com> Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
|
I plan to merge this in the coming days if no further blocking comments are made. |
|
👍 |
This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
* util/strutil: add Jaro-Winkler similarity implementation This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * util/strutil: optimise JaroWinkler with string-native ASCII path Replace the generic jaroWinkler[T byte|rune] with two specialised functions: jaroWinklerString (ASCII path) operates directly on the string values and avoids the []byte conversion that previously caused two heap allocations per call; jaroWinklerRunes (Unicode path) is unchanged in algorithm but split out from the generic. Both paths replace the repeated float64 divisions in the Jaro formula with precomputed reciprocals (invL1, invL2). Result: short ASCII strings drop from 2 allocs/op to 0 allocs/op; long ASCII drops from 4 allocs/op to 2 allocs/op (bool match arrays only). Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * util/strutil: replace JaroWinkler with JaroWinklerMatcher Remove the free JaroWinkler function and replace it with a JaroWinklerMatcher struct. NewJaroWinklerMatcher pre-computes the ASCII check and rune conversion for the search term once; Score then runs the comparison against each candidate without repeating that work. This is the expected usage pattern in Prometheus: one fixed term scored against many label names or values. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * Update util/strutil/jarowinkler.go and util/strutil/jarowinkler_test.go Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com> Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> --------- Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
…18405) * util/strutil: add Jaro-Winkler similarity implementation This is part of the implementation of prometheus/proposals#74 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * util/strutil: optimise JaroWinkler with string-native ASCII path Replace the generic jaroWinkler[T byte|rune] with two specialised functions: jaroWinklerString (ASCII path) operates directly on the string values and avoids the []byte conversion that previously caused two heap allocations per call; jaroWinklerRunes (Unicode path) is unchanged in algorithm but split out from the generic. Both paths replace the repeated float64 divisions in the Jaro formula with precomputed reciprocals (invL1, invL2). Result: short ASCII strings drop from 2 allocs/op to 0 allocs/op; long ASCII drops from 4 allocs/op to 2 allocs/op (bool match arrays only). Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * util/strutil: replace JaroWinkler with JaroWinklerMatcher Remove the free JaroWinkler function and replace it with a JaroWinklerMatcher struct. NewJaroWinklerMatcher pre-computes the ASCII check and rune conversion for the search term once; Score then runs the comparison against each candidate without repeating that work. This is the expected usage pattern in Prometheus: one fixed term scored against many label names or values. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * Update util/strutil/jarowinkler.go and util/strutil/jarowinkler_test.go Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com> Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> --------- Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Raphael Bizos <r.bizos@criteo.com>
|
|
||
| ### `GET /api/v1/search/metric_names` | ||
|
|
||
| An endpoint specific to searching for metric names (\_\_name__ values) and obtaining an enriched record for each metric name. |
There was a problem hiding this comment.
The query parameters itself are very limited for fuzzy search usecase. But ideally this API can be extended to support more usecases like semantic search in the AI era.
I know what TSDB can do is kind of limited but maybe we can design the API in a way that can be extended to allow more search usecase like using an external search service like OpenSearch
#### What this PR does This is the first PR for the implementation of a new streaming labels/values search API. See prometheus/proposals#74. This PR leverages Prometheus vendored storage interface changes and search/filter utilities. For reference - noting not all of these are merged as yet. * prometheus/prometheus#18405 * prometheus/prometheus#18402 * prometheus/prometheus#18576 * prometheus/prometheus#18497 * prometheus/prometheus#18573 This PR focuses on the Searcher implementation which will run on the Ingester and Store-Gateways. The changes are at the gRPC boundary into these components. Once this PR is merged, the next PR will focus on the Querier and fan-out of the search requests to the changes made in this PR. TBD - should a change log be recorded for non-user facing change. None of the changes in this PR will be inline to any user operation at this time. #### Which issue(s) this PR fixes or relates to Fixes #<issue number> #### Checklist - [x] Tests updated. - [ ] Documentation added. - [ ] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry is not needed, please add the `changelog-not-needed` label to the PR. - [ ] [`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md) updated with experimental features. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Introduces new gRPC streaming APIs and non-trivial merge/ordering logic on the read path; while largely additive, it touches core ingester/store-gateway query surfaces and could impact correctness or performance under load. > > **Overview** > Adds a new *streaming labels/values search API* across ingester and store-gateway, including new gRPC methods `SearchLabelNames` / `SearchLabelValues`, request/response protos (`SearchFilter`, `SearchOrdering`, `SearchResultBatch`), and regenerated client/server stubs. > > Implements the ingester-side `storage.Searcher` integration with batched streaming, input validation mapped to `InvalidArgument`, warning propagation, and context-cancellation checks; wires the new RPCs through activity tracking and profiling wrappers. > > Implements the store-gateway search path by running per-block searches concurrently, applying per-block filter/order/limit, and streaming a deduplicated k-way merge via a new `PairwiseMergeSearchSets` utility (ported from Prometheus), plus wrappers for tracing, activity tracking, and client error wrapping; adds comprehensive tests for batching, ordering, limits, warnings, errors, and cancellation. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 3e5862c. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
No description provided.