hypervisor/kubevirt: fix CPU metrics always reporting zero #5896
naiming-zededa wants to merge 1 commit into lf-edge:master
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5896      +/-   ##
==========================================
- Coverage   19.52%   17.11%    -2.42%
==========================================
  Files          19      474      +455
  Lines        3021    85692    +82671
==========================================
+ Hits          590    14664    +14074
- Misses       2310    69511    +67201
- Partials      121     1517     +1396
    if r.UsedMemory > r.AvailableMemory {
        r.UsedMemory = r.UsedMemory - r.AvailableMemory
    } else {
        r.UsedMemory = 0
The use of the variable name UsedMemory is very confusing here. UsedMemory is actually assigned the total available memory at line 143: https://github.com/lf-edge/eve/pull/5896/changes#diff-12854d8763366bf5e65d3665b532bd7cf18078a2b50bad3ffa7500965c5c952aL143
Only later (here) is the proper calculation performed. As I understand it, when UsedMemory (at this point equal to total memory) is less than AvailableMemory, there is no free memory, yet UsedMemory is then set to 0. @naiming-zededa, could you please clarify?
@rene you are right. I changed this to 'domainAvailMB' and updated the PR.
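For illustration, the guarded subtraction discussed in this thread can be sketched as a standalone Go program. The helper name usedMemoryMB and the parameter usableMB are hypothetical; domainAvailMB is the name mentioned in the reply, and the mapping of these values to the KubeVirt available_bytes and usable_bytes metrics is an assumption, not the PR's actual code.

```go
package main

import "fmt"

// usedMemoryMB returns domainAvailMB minus usableMB, clamped at zero.
// Hypothetical helper: domainAvailMB is assumed to be the memory available
// to the domain and usableMB the part the guest reports as still usable.
func usedMemoryMB(domainAvailMB, usableMB uint32) uint32 {
	if domainAvailMB > usableMB {
		return domainAvailMB - usableMB
	}
	// Stale balloon data can report usable > available; avoid wrapping around.
	return 0
}

func main() {
	var domainAvailMB, usableMB uint32 = 1024, 1100

	// Unguarded unsigned subtraction wraps modulo 2^32.
	fmt.Println(domainAvailMB - usableMB) // 4294967220

	// Guarded version clamps to zero instead.
	fmt.Println(usedMemoryMB(domainAvailMB, usableMB)) // 0

	// Normal case: available exceeds usable.
	fmt.Println(usedMemoryMB(2048, 512)) // 1536
}
```

Keeping the intermediate value in a separately named variable, rather than reusing the result field, is what makes the intent of the comparison readable.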
GetDomsCPUMem was matching against kubevirt_vmi_cpu_usage_seconds, but the virt-handler emits the metric with a _total suffix: kubevirt_vmi_cpu_usage_seconds_total. The mismatch caused CPUTotalNs to remain 0 for all app VMs regardless of load.

Also guard against uint32 underflow in UsedMemory when guest agent balloon data is stale (usable_bytes > available_bytes).

Signed-off-by: naiming-zededa <naiming@zededa.com>
5320a0a to a9b4f3e
Description
GetDomsCPUMem was matching against kubevirt_vmi_cpu_usage_seconds, but
the virt-handler emits the metric with a _total suffix:
kubevirt_vmi_cpu_usage_seconds_total. The mismatch caused CPUTotalNs
to remain 0 for all app VMs regardless of load.
Also guard against uint32 underflow in UsedMemory when guest agent
balloon data is stale (usable_bytes > available_bytes).
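The effect of the name mismatch can be reproduced with a small standalone Go sketch. It is not the code of GetDomsCPUMem: the helper names metricName and cpuTotalNs, the sample label name="vm1", and the assumption that samples carry no trailing timestamp are all illustrative. It only shows that an exact match on the unsuffixed name finds nothing, so the total stays at zero.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Sample Prometheus text-format output; the label set is hypothetical.
const sample = `# TYPE kubevirt_vmi_cpu_usage_seconds_total counter
kubevirt_vmi_cpu_usage_seconds_total{name="vm1"} 12.5
kubevirt_vmi_memory_available_bytes{name="vm1"} 1073741824
`

// metricName returns the part of a sample line before the labels or value.
func metricName(line string) string {
	if i := strings.IndexAny(line, "{ "); i >= 0 {
		return line[:i]
	}
	return line
}

// cpuTotalNs sums all samples whose name equals want, converted from
// seconds to nanoseconds. Assumes the value is the last field on the line.
func cpuTotalNs(exposition, want string) uint64 {
	var total uint64
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		if metricName(line) != want {
			continue
		}
		fields := strings.Fields(line)
		secs, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			continue
		}
		total += uint64(secs * 1e9)
	}
	return total
}

func main() {
	// Name without the suffix: no sample matches.
	fmt.Println(cpuTotalNs(sample, "kubevirt_vmi_cpu_usage_seconds")) // 0
	// Name as emitted by virt-handler.
	fmt.Println(cpuTotalNs(sample, "kubevirt_vmi_cpu_usage_seconds_total")) // 12500000000
}
```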
PR dependencies
How to test and validate this PR
Configure an edge-node cluster and deploy some VMIs on it.
Monitor the App Instance CPU usage in the UI while generating some load,
for example with the 'stress-ng' tool.
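As a rough aid when validating, a CPU usage percentage can be derived from two successive cumulative counter readings. The sketch below is an assumption about how such a figure might be computed; the source does not describe how the UI derives it, and the function name cpuPercent and the sample values are hypothetical. With the bug present, both readings would be 0 and the result would stay at 0 regardless of load.

```go
package main

import "fmt"

// cpuPercent converts two cumulative CPU-time readings (in nanoseconds),
// taken intervalNs apart, into a percentage of one CPU. Values above 100
// are possible for guests with more than one vCPU.
func cpuPercent(prevNs, curNs, intervalNs uint64) float64 {
	if curNs <= prevNs || intervalNs == 0 {
		return 0 // no progress, counter reset, or invalid interval
	}
	return float64(curNs-prevNs) / float64(intervalNs) * 100
}

func main() {
	const tenSecondsNs = 10_000_000_000

	// Buggy behaviour: the counter never moves from zero.
	fmt.Println(cpuPercent(0, 0, tenSecondsNs)) // 0

	// After the fix: 5 s of CPU time consumed over a 10 s window.
	fmt.Println(cpuPercent(1_000_000_000, 6_000_000_000, tenSecondsNs)) // 50
}
```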
Changelog notes
hypervisor/kubevirt: fix CPU metrics always reporting zero
PR Backports