feat: Round-up file-sizes by block-size in file-cache (opt-out by experimental-file-cache-disable-size-calculation-fix)#4514
Conversation
e9f7409 to
c2ebcf2
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4514 +/- ##
==========================================
+ Coverage 83.63% 83.65% +0.01%
==========================================
Files 163 164 +1
Lines 20150 20185 +35
==========================================
+ Hits 16853 16885 +32
- Misses 2661 2663 +2
- Partials 636 637 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/gemini review |
|
/gemini summary |
There was a problem hiding this comment.
Code Review
This pull request introduces a fix for file cache size accounting by considering the filesystem block size, enabled by default via a new experimental flag. The implementation extends the LRU cache to support custom size calculation functions and adds new utility functions to determine disk usage. The changes are well-structured. However, there is a critical bug in the LRU cache implementation when updating existing entries, which could lead to incorrect cache size management. Please see the specific comment for details.
Summary of ChangesThis pull request introduces a significant improvement to how file cache sizes are accounted for, ensuring that the reported cache usage more accurately reflects the actual disk space consumed. By rounding up cached file sizes to the nearest filesystem block size, the system can better adhere to configured cache limits. An experimental flag has been added to provide flexibility in controlling this new behavior, which is a foundational step towards more robust cache management. Highlights
Changelog
Activity
|
c2ebcf2 to
a513656
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces a fix for file cache size accounting by considering the filesystem block size. It adds a new experimental flag to disable this behavior. The changes involve modifying the LRU cache to support custom size calculation functions and adding utility functions to get disk properties. My review focuses on a potential bug in the new LRU cache API, adherence to the project's code generation process, and general code quality improvements such as removing magic numbers and strengthening tests.
059e6f0 to
5d4f2ce
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a fix for file cache size accounting by rounding up file sizes to the filesystem's block size, providing a more accurate measure of disk utilization. This is controlled by a new experimental flag, experimental-file-cache-disable-size-calculation-fix, which is off by default (meaning the fix is active). The implementation involves significant and well-executed modifications to the LRU cache to support custom size calculation functions, along with new utility functions for disk properties. The changes are accompanied by thorough unit and integration tests.
My main feedback is regarding a confusing log message in internal/cache/file/cache_handler.go that incorrectly states the reason for the size calculation fix being disabled. I've provided a suggestion to correct the logic and improve clarity for users.
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable fix for file cache size accounting by rounding up file sizes to the filesystem's block size, leading to more accurate cache size management. The implementation is robust, introducing a flexible sizing mechanism in the LRU cache and encapsulating disk utility functions cleanly. The changes are well-supported by comprehensive unit tests, and the new configuration flag is correctly integrated. I have identified a minor issue with the copyright year in the newly added files, and have provided guidance on how to address it, especially if the files are auto-generated.
39049d4 to
f6af215
Compare
146c6cb to
a2d4efa
Compare
a2d4efa to
c75ddfc
Compare
663608f to
8bdec39
Compare
Description
cache-dirat the specified limit offile-cache-max-size-mb.experimental-file-cache-disable-size-calculation-fixto disable the fix, if needed as a fallback for unexpected behavior change.Note: This fix makes no differences to the cases where
file-cache-max-size-mbis specified as-1, which is the default value for that flag. So, this fix doesn't change anything unlessfile-cache-max-size-mbis explicitly set to a value other than-1.Note: This fix doesn't completely ensure that the actual disk-size of
cache-dirdoesn't go beyondfile-cache-max-size-mbas it doesn't address the space taken up by directories incache-dir. It also doesn't clear up empty directories incache-dir. For that, more changes are planned in further phases.Note: This fix doesn't apply when the cache is sparse-enabled (i.e.
file-cache-experimental-enable-chunk-cache=true), as the file-size rounding-up is not needed for them, so this fix is disabled for that case.Link to the issue in case of a bug fix.
b/496408968
Testing details
file-cache-max-size-mbset to1andnumber of directories ~= number of filesin the read workload. With that set up, I let a multi-thread read application run for about 2 minutes. Saw the effect of the change.du -sh <cache-dir>) grew up to 4.5 MBAny backward incompatible change? If so, please explain.
The fix can potentially change the user experience indirectly, as the
cache-dirdisk-utilization will now be closer tofile-cache-max-size-mb, and it can reduce how many files are actually cached, if there are many small files.