feat: Refresh cache when busting entries #3184
bblaszkow06 wants to merge 24 commits into Logflare:main
Conversation
just realised that this will result in 1 db query per node for each wal record propagated... perhaps we can perform the fetch before propagation, so shift the fetch up to the broadcaster
Ziinc left a comment
need to shift up fetch logic from receiving node level to broadcaster level
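Shifted up to the broadcaster, it could look roughly like this (a sketch only; the `Module.concat`-based cache lookup and the `Logflare.PubSub` / `"cache_buster"` names are assumptions, not necessarily Logflare's actual code):

```elixir
# Hypothetical sketch: resolve the fresh struct once, before propagation,
# so each receiving node avoids running its own DB query per WAL record.
defp broadcast({:refresh, {context, pkey}}) when is_integer(pkey) do
  # assumption: each context has a co-located Cache module
  cache_module = Module.concat(context, Cache)

  payload =
    if function_exported?(cache_module, :fetch_by_id, 1) do
      case cache_module.fetch_by_id(pkey) do
        nil -> {:bust, {context, pkey}}
        struct -> {:fetched, {context, pkey, struct}}
      end
    else
      {:bust, {context, pkey}}
    end

  # receivers only apply the payload; no further DB roundtrip needed
  Phoenix.PubSub.broadcast(Logflare.PubSub, "cache_buster", payload)
end
```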
```elixir
if function_exported?(cache_module, :fetch_by_id, 1) do
  case cache_module.fetch_by_id(pkey) do
    nil -> {:bust, {context, pkey}}
    struct -> {:fetched, {context, pkey, struct}}
  end
else
  {:bust, {context, pkey}}
end
```
can be a with/do to be more compact.
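One possible `with/do` shape (a sketch of the suggestion, not tested against the PR; relies on `nil` failing the `%{} = struct` match so it falls through to `else`):

```elixir
# Compact form: both the missing-function case and the nil fetch result
# fall through to the same {:bust, ...} branch.
with true <- function_exported?(cache_module, :fetch_by_id, 1),
     %{} = struct <- cache_module.fetch_by_id(pkey) do
  {:fetched, {context, pkey, struct}}
else
  _ -> {:bust, {context, pkey}}
end
```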
```elixir
defp fetch_or_bust({:refresh, {context, keyword}}), do: {:bust, {context, keyword}}
defp fetch_or_bust({:bust, {context, key}}), do: {:bust, {context, key}}
```
Suggested change:

```diff
-defp fetch_or_bust({:refresh, {context, keyword}}), do: {:bust, {context, keyword}}
-defp fetch_or_bust({:bust, {context, key}}), do: {:bust, {context, key}}
+defp fetch_or_bust({:refresh, {context, keyword}}) when is_list(keyword), do: {:bust, {context, keyword}}
+defp fetch_or_bust(other), do: other
```
I've kept the last match to ensure a quicker raise in case of an invalid entry
```diff
   record: %{"source_id" => source_id}
 }) do
-  {SavedSearches, [source_id: String.to_integer(source_id)]}
+  {:refresh, {SavedSearches, [source_id: String.to_integer(source_id)]}}
```
if we know that kw lists are ignored for :refresh then we could just do :bust?
We could, but it's just a limitation of ContextCache, hopefully solvable in the future. In the context of CacheBuster, :refresh seems more appropriate here, but I agree it might be misleading. I'll leave the decision up to you:
```diff
-{:refresh, {SavedSearches, [source_id: String.to_integer(source_id)]}}
+{:bust, {SavedSearches, [source_id: String.to_integer(source_id)]}}
```
```elixir
Enum.reduce(to_delete, 0, fn {k, _v}, acc ->
  Cachex.del(worker, k)
  acc + 1
```

```elixir
defp refresh_key(context, pkey, fresh_struct) when is_integer(pkey) do
```
The fresh struct might not have the appropriate preloads that certain context functions perform. We can't do a simple Repo.get and assume that context functions return the same struct; some context functions perform preloading/virtual field calculations.
We would need to re-run the specific context function that is affected and then broadcast that result for in-place replacement.
IMO this would require a significant refactor, something like what I suggested some time ago on Slack (in short: store only bare entries in the cache by id; other keys would only refer to the id; preloads would load from many caches; derivatives would be recalculated; and probably each cache would handle its own busting, since handling that at the central cache buster level seems problematic).
Anyway, even currently preloads are problematic, as they are often stale (e.g. changes to the join table do not bust other entries; for 1-n and 1-1 assocs only one side is busted/refreshed).
To pass this PR without a major refactor, I'm thinking I could change fetch_by_id to return a map %{cache_key => fresh_value}, but it would require knowing all the keys (so generic functions with kw lists as args would need to be replaced as well).
Alternatively, I could create a blacklist of keys that should not use the fetched value, resulting in busting instead of refreshing. WDYT is better for closing this PR?
The former, where only bare entries are kept in the cache, can lead to a complex dependency graph, and if any cache has a hiccup when performing a query then things could potentially back up. For example, one fetch triggers another fetch, which triggers another fetch; that would be more brittle and add 3x roundtrips.
Migrating the codebase away from relying on preloads would be better.
But for now, what would be the issue with performing a refresh for a context cache's functions that accept a pkey and then broadcasting those? The latter suggestion can work, but it would greatly limit the benefit and utility of this feature.
> what would be the issue with performing a refresh for a context cache's functions that accept a pkey
- Not all of them accept a pkey (e.g. Sources.get_by_and_preload may be called with a token)
- CacheBuster cannot rely on browsing keys in the local cache, as other nodes may have some other keys pointing to an entry with a specific :id
- To determine the keys/functions to call at the CacheBuster level without hardcoding a full list there, the work needs to be delegated to a specific context. That's what fetch_by_id does now. It would need to fetch all the possible variants (with preloads and/or virtual fields set), but that's impossible when the cached function accepts a list of preloads. Only by browsing the codebase and seeing all calls to such a function can the list of variants to fetch be determined. This means the cached functions would have to be refactored to a "static" variant, matching what is used in the codebase.

All in all, it's not that complicated, but not very simple, and quite verbose.
> To pass this PR without a major refactor I'm thinking I could change the fetch_by_id to return a map %{cache_key => fresh_value} but it would require knowing all the keys (so generic functions with kw as args would need to be replaced as well)
let's try this approach, perhaps a list of maps, and we gradually add more cache_keys to this over time where it makes sense. We can ignore kw lists, as those will be much more complex, and busting plus a local query on cache miss would be better.
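A minimal sketch of that shape (module, function, and key names are hypothetical illustrations, not Logflare's actual API; assumes Cachex-backed caches):

```elixir
# Hypothetical sketch: fetch_by_id/1 returns a map of cache keys to fresh
# values, so the buster can put entries in place instead of deleting them.
def fetch_by_id(id) do
  # single DB roundtrip on the broadcasting node
  case Repo.get(Source, id) do
    nil ->
      %{}

    source ->
      # keys mirror how the cached functions are memoized; names illustrative
      %{
        {:get_source, [id]} => source,
        {:get_source_by_token, [source.token]} => source
      }
  end
end

# At the buster level: refresh known keys in place; anything not covered by
# the map falls back to busting plus a local query on cache miss.
defp apply_refresh(worker, key_to_value) when is_map(key_to_value) do
  Enum.each(key_to_value, fn {key, fresh} -> Cachex.put(worker, key, fresh) end)
end
```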
Closes ANL-1351