Migration: Aggregate Dirty Log in Thread, Reduce Migration Downtime #82
Draft

phip1611 wants to merge 4 commits into cyberus-technology:gardenlinux
Conversation
Force-pushed from f82765c to 4e9ab64
Force-pushed from 4e9ab64 to f403533
Force-pushed from f403533 to 47a44c8
Force-pushed from 47a44c8 to 903e297
To aggregate the dirty log in a thread asynchronously, we need to be able to properly merge MemoryRangeTables into each other to prevent transmitting the same memory multiple times.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
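The merge this commit describes can be sketched as follows. This is a simplified stand-in, assuming a table of sorted `(start, length)` pairs; the real `MemoryRangeTable` lives in Cloud Hypervisor's vm-migration crate and is not reproduced here.

```rust
/// Simplified, hypothetical stand-in for `MemoryRangeTable`:
/// a list of (start, length) ranges of dirtied guest memory.
#[derive(Debug, Default, PartialEq)]
struct RangeTable {
    /// Kept sorted and non-overlapping after every merge.
    ranges: Vec<(u64, u64)>,
}

impl RangeTable {
    /// Merges `other` into `self`, coalescing overlapping or adjacent
    /// ranges so the same memory is never listed (and sent) twice.
    fn extend(&mut self, other: &RangeTable) {
        self.ranges.extend_from_slice(&other.ranges);
        self.ranges.sort_by_key(|&(start, _)| start);
        let mut merged: Vec<(u64, u64)> = Vec::new();
        for &(start, len) in &self.ranges {
            match merged.last_mut() {
                // Overlapping or adjacent: grow the previous range.
                Some((s, l)) if start <= *s + *l => {
                    let end = (start + len).max(*s + *l);
                    *l = end - *s;
                }
                _ => merged.push((start, len)),
            }
        }
        self.ranges = merged;
    }
}

fn main() {
    let mut a = RangeTable { ranges: vec![(0, 4096), (8192, 4096)] };
    let b = RangeTable { ranges: vec![(4096, 4096)] };
    a.extend(&b);
    // The three adjacent 4 KiB ranges coalesce into one 12 KiB range.
    assert_eq!(a.ranges, vec![(0, 12288)]);
    println!("{:?}", a.ranges);
}
```

Coalescing adjacent ranges (not just overlapping ones) keeps the table small, which matters because the table is merged repeatedly by the worker thread between transmissions.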
Force-pushed from 0df55d4 to 4c033b4
Comment on lines +43 to +56
```rust
/// All shared state of [`DirtyLogWorker`] that is behind the same lock.
struct DirtyLogWorkerProtectedState {
    /// The dirty rates measured in the past [`DIRTY_RATE_CALC_TIMESLICE`].
    ///
    /// Used to calculate the dirty rate.
    dirty_rates_pps: VecDeque<u64>,
    /// The constantly updated (and merged) memory range table since the data
    /// was moved out of the struct the last time.
    table: MemoryRangeTable,
    /// The timestamp of the last processing, used to calculate the dirty rate.
    last_timestamp: Instant,
    /// Set to true to signal the worker thread to stop and exit.
    stop: bool,
}
```
What does "Protected" mean here?
Does being behind the same lock imply that there are no locks within the struct? If yes, I'd prefer to document that.
Fair question! I'm looking for a new name.
Yes thanks!
I'm still unsure about the comment regarding being behind the same lock.
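The "everything behind one lock" convention under discussion could be sketched like this. The field names follow the diff; the handle, the spawn loop, and all other scaffolding here are illustrative assumptions, not the PR's actual code.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// All mutable worker state lives behind a single `Mutex`; the fields
/// themselves contain no further locks, so there is no lock-ordering
/// concern. (Illustrative sketch; field names taken from the diff.)
#[derive(Debug)]
struct DirtyLogWorkerState {
    dirty_rates_pps: VecDeque<u64>,
    last_timestamp: Instant,
    stop: bool,
}

/// Hypothetical handle owning the shared state and the worker thread.
struct DirtyLogWorkerHandle {
    state: Arc<Mutex<DirtyLogWorkerState>>,
    thread: thread::JoinHandle<()>,
}

impl DirtyLogWorkerHandle {
    fn spawn() -> Self {
        let state = Arc::new(Mutex::new(DirtyLogWorkerState {
            dirty_rates_pps: VecDeque::new(),
            last_timestamp: Instant::now(),
            stop: false,
        }));
        let worker_state = Arc::clone(&state);
        let thread = thread::spawn(move || loop {
            let mut s = worker_state.lock().unwrap();
            if s.stop {
                break;
            }
            // Placeholder for fetching and merging the dirty log.
            s.dirty_rates_pps.push_back(0);
            s.last_timestamp = Instant::now();
            drop(s);
            thread::sleep(Duration::from_millis(1));
        });
        Self { state, thread }
    }

    /// Signals the worker to stop and waits for it to exit.
    fn stop(self) {
        self.state.lock().unwrap().stop = true;
        self.thread.join().unwrap();
    }
}

fn main() {
    let handle = DirtyLogWorkerHandle::spawn();
    thread::sleep(Duration::from_millis(5));
    handle.stop();
    println!("worker stopped cleanly");
}
```

Documenting "no locks inside the struct" as an invariant on the state type makes the single-lock design auditable at a glance.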
Comment on lines +183 to +186
```rust
/// Starts the thread and let it run until [`DirtyLogWorkerHandle::stop`] is called.
pub fn run(self) -> Result<(), MigratableError /* dirty log error */> {
    info!("starting thread");
```
Nit: this doesn't start the thread, it's the method that's called by the thread.
The doc comment still refers to starting the thread.
Suggested change

```diff
-/// Starts the thread and let it run until [`DirtyLogWorkerHandle::stop`] is called.
-pub fn run(self) -> Result<(), MigratableError /* dirty log error */> {
-    info!("starting thread");
+/// Fetches the dirty log and updates the internal metrics.
+///
+/// Thread entry function, executed until [`DirtyLogWorkerHandle::stop`] is called.
+pub fn run(self) -> Result<(), MigratableError /* dirty log error */> {
+    info!("thread started");
```
Force-pushed from 4c033b4 to 7b7bcda
This adds the basic plumbing for the DirtyLogWorker which will fetch the dirty log asynchronously in the background and aggregate the effective MemoryRangeTable with dirtied memory.

# Motivation

- Performance: Fetching the dirty log, parsing the dirty bitmap, and aggregating the corresponding data structures is fairly costly. I just ran a VM with an active working set of 5 GiB (with 4 workers) and the measured overhead per iteration was 10-20 ms. Given that we want to have as small downtimes as possible, we want that overhead to be close to zero for the final iteration.
- Accurate dirty rate: This way, we have more fine-grained sampling of the dirty rate (dirtied 4k pages per second), which is an interesting metric to observe the current workload (regarding memory writes).

# Bigger Picture / Outlook to KVM's Dirty Ring Interface

The most robust and performant way for Cloud Hypervisor to get dirtied pages in the future is KVM's dirty ring interface [0]. This requires [1] to be merged first in rust-vmm/kvm. Experience showed that bumping any of the rust-vmm crates is a major challenge, as all of them are highly interdependent and developed in individual repositories. So it will take some time before we can even consider starting the work on that feature in CHV.

That being said: this design improves the current situation significantly without blocking any future refactorings or replacements with KVM's dirty ring interface.

# Design

I actively decided against Arc<Mutex<Vm>> in the DirtyLogWorker as this would be very invasive, make the migration code overly complicated (many locks and unlocks at the right times) and, lastly, be a very big change only to call `vm.dirty_log()` in the thread. Note that the latter is just a thin wrapper around calling `cpu_manager.dirty_log()` and `memory_manager.dirty_log()`.

[0] https://lwn.net/Articles/833206/
[1] rust-vmm/kvm#344

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Now, the overhead per pre-copy iteration drops to 0 ms. However, we have a small overhead from joining the thread, which takes <= 1 ms in my setup.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Force-pushed from 7b7bcda to 5b245da
```rust
}

/// Starts the thread and let it run until [`DirtyLogWorkerHandle::stop`] is called.
pub fn run(self) -> Result<(), MigratableError /* dirty log error */> {
```
As far as I can tell, this is (and should be) only called by the DirtyLogWorker::spawn method.
Suggested change

```diff
-pub fn run(self) -> Result<(), MigratableError /* dirty log error */> {
+fn run(self) -> Result<(), MigratableError /* dirty log error */> {
```
This adds the basic plumbing for the DirtyLogWorker which will fetch the dirty log asynchronously in the background and aggregate the effective MemoryRangeTable with dirtied memory.
Context
This wasn't planned. I did it as a side-project in the past weeks and finalized my work now! I see this as crucial for production-grade live-migration.
Motivation

- Performance: Fetching the dirty log, parsing the dirty bitmap, and aggregating the corresponding data structures is fairly costly. I just ran a VM with an active working set of 5 GiB (with 4 workers) and the measured overhead per iteration was 10-20 ms. Given that we want to have as small downtimes as possible, we want that overhead to be close to zero for the final iteration.
- Accurate dirty rate: This way, we have more fine-grained sampling of the dirty rate (dirtied 4k pages per second), which is an interesting metric to observe the current workload (regarding memory writes).
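The dirty-rate bookkeeping described above (a queue of pages-per-second samples, smoothed over a timeslice) could be sketched as follows. The constant, the function name, and the sample count are illustrative assumptions, not the PR's actual code.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical: number of recent samples retained for smoothing
/// (the PR's actual window is called DIRTY_RATE_CALC_TIMESLICE).
const MAX_SAMPLES: usize = 8;

/// Converts one dirty-log fetch (`dirty_pages` 4 KiB pages dirtied since
/// the previous fetch, `elapsed` since then) into a pages-per-second
/// sample and returns the average over the retained samples.
fn update_dirty_rate(
    samples: &mut VecDeque<u64>,
    dirty_pages: u64,
    elapsed: Duration,
) -> u64 {
    let pps = (dirty_pages as f64 / elapsed.as_secs_f64()) as u64;
    samples.push_back(pps);
    // Drop the oldest samples once the window is full.
    while samples.len() > MAX_SAMPLES {
        samples.pop_front();
    }
    samples.iter().sum::<u64>() / samples.len() as u64
}

fn main() {
    let mut samples = VecDeque::new();
    // 1280 pages (5 MiB) dirtied in 100 ms -> 12800 pages/s.
    let rate = update_dirty_rate(&mut samples, 1280, Duration::from_millis(100));
    assert_eq!(rate, 12800);
    println!("dirty rate: {rate} pages/s");
}
```

Sampling on every background fetch, rather than once per pre-copy iteration, is what makes the fine-grained rate possible.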
Design

I actively decided against Arc<Mutex<Vm>> in the DirtyLogWorker as this would be very invasive, make the migration code overly complicated (many locks and unlocks at the right times) and, lastly, be a very big change only to call `vm.dirty_log()` in the thread. Note that the latter is just a thin wrapper around calling `cpu_manager.dirty_log()` and `memory_manager.dirty_log()`.

Bigger Picture / Outlook to KVM's Dirty Ring Interface
The most robust and performant version which Cloud Hypervisor should use
to get dirtied pages in the future is KVM's dirty ring interface [0].
This requires [1] to be merged first in rust-vmm/kvm. Experience showed
that bumping any of the rust-vmm crates is a major challenge as all of
them are highly interdependent and developed in individual repositories.
So it will take some time before we can even consider starting the work
of that feature in CHV.
Steps to Undraft
Hints for Reviewers / Testing
Closes https://github.com/cobaltcore-dev/cobaltcore/issues/280