hypervisor/kubevirt: fix CPU metrics always reporting zero #5896

Open
naiming-zededa wants to merge 1 commit into lf-edge:master from naiming-zededa:naiming-vmi-cpu-fix
@naiming-zededa
Contributor

Description

GetDomsCPUMem was matching against kubevirt_vmi_cpu_usage_seconds, but
the virt-handler emits the metric with a _total suffix:
kubevirt_vmi_cpu_usage_seconds_total. The mismatch caused CPUTotalNs
to remain 0 for all app VMs regardless of load.
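
To illustrate why the missing suffix left the counter at zero, here is a minimal sketch of an exact-name match against one sample line in the Prometheus text format. The helper cpuNsFromLine and the sample line (label set and value) are hypothetical and are not taken from GetDomsCPUMem; the sketch also assumes no spaces inside label values and no trailing timestamp.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuNsFromLine is a hypothetical helper: it returns the sample value of
// metricName converted from seconds to nanoseconds, or false when the line
// does not carry exactly that metric name.
func cpuNsFromLine(line, metricName string) (uint64, bool) {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return 0, false
	}
	// The metric name is everything before the optional label set.
	name := fields[0]
	if i := strings.IndexByte(name, '{'); i >= 0 {
		name = name[:i]
	}
	if name != metricName {
		return 0, false
	}
	secs, err := strconv.ParseFloat(fields[1], 64)
	if err != nil || secs < 0 {
		return 0, false
	}
	return uint64(secs * 1e9), true
}

func main() {
	// Assumed sample line; the labels and value are made up for illustration.
	line := `kubevirt_vmi_cpu_usage_seconds_total{name="app1"} 12.5`

	// Name without the suffix: no match, so the caller keeps its zero value.
	_, ok := cpuNsFromLine(line, "kubevirt_vmi_cpu_usage_seconds")
	fmt.Println("without _total:", ok)

	// Name with the suffix: the counter is found and converted.
	ns, ok := cpuNsFromLine(line, "kubevirt_vmi_cpu_usage_seconds_total")
	fmt.Println("with _total:", ok, ns)
}
```

With any exact comparison of this kind, a name that lacks _total never matches, so a value such as CPUTotalNs stays at its initial zero no matter how busy the VM is.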

Also guard against uint32 underflow in UsedMemory when guest agent
balloon data is stale (usable_bytes > available_bytes).
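
The underflow guard can be sketched as a saturating subtraction. This is an illustrative example rather than the code in this PR: saturatingSub and the variable names totalMB and usableMB are hypothetical.

```go
package main

import "fmt"

// saturatingSub is a hypothetical helper: it returns a-b, clamped to 0 when
// b exceeds a, instead of letting the unsigned result wrap around.
func saturatingSub(a, b uint32) uint32 {
	if a > b {
		return a - b
	}
	return 0
}

func main() {
	// Assumed stale balloon data: the "usable" figure exceeds the total.
	totalMB, usableMB := uint32(100), uint32(200)

	// Unguarded unsigned subtraction wraps modulo 2^32.
	fmt.Println("unguarded:", totalMB-usableMB)

	// The guarded form reports 0 used memory instead of a huge bogus value.
	fmt.Println("guarded:  ", saturatingSub(totalMB, usableMB))
}
```

Clamping to zero hides the stale reading rather than correcting it, but it keeps a single bad sample from showing up as roughly 4 billion MB of used memory.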

PR dependencies

How to test and validate this PR

Configure an edge-node cluster and deploy some VMIs on it.
Monitor the App Instance CPU usage in the UI while generating load inside
the VMs, for example with the 'stress-ng' tool.

Changelog notes

hypervisor/kubevirt: fix CPU metrics always reporting zero

PR Backports

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR

For backport PRs (remove it if it's not a backport):

  • I've added a reference link to the original PR
  • PR's title follows the template

And last but not least:

  • I've checked the boxes above, or I've provided a good reason why I didn't
    check them.

Please, check the boxes above after submitting the PR in interactive mode.


@zedi-pramodh zedi-pramodh left a comment


LGTM

@codecov

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 17.11%. Comparing base (2281599) to head (a9b4f3e).
⚠️ Report is 644 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5896      +/-   ##
==========================================
- Coverage   19.52%   17.11%   -2.42%     
==========================================
  Files          19      474     +455     
  Lines        3021    85692   +82671     
==========================================
+ Hits          590    14664   +14074     
- Misses       2310    69511   +67201     
- Partials      121     1517    +1396     


if r.UsedMemory > r.AvailableMemory {
	r.UsedMemory = r.UsedMemory - r.AvailableMemory
} else {
	r.UsedMemory = 0
}
Contributor


The use of the variable name UsedMemory is super confusing here. UsedMemory is actually assigned the total available memory at line 143: https://github.com/lf-edge/eve/pull/5896/changes#diff-12854d8763366bf5e65d3665b532bd7cf18078a2b50bad3ffa7500965c5c952aL143

Only later (here) is the actual calculation performed. As I understand it, when UsedMemory (at this point = total memory) < AvailableMemory, there is no free memory, yet UsedMemory is then set to 0. @naiming-zededa, could you please clarify?

Contributor Author


@rene you are right. I changed this to 'domainAvailMB' and updated the PR.

Signed-off-by: naiming-zededa <naiming@zededa.com>
@naiming-zededa naiming-zededa force-pushed the naiming-vmi-cpu-fix branch from 5320a0a to a9b4f3e on May 5, 2026 19:18
@github-actions github-actions Bot requested a review from rene May 5, 2026 19:19