Skip to content

Update computation logic for pod count#1862

Merged
mbvreddy merged 7 commits intokruize:pod-count-fixfrom
mbvreddy:compute-logic
Apr 6, 2026
Merged

Update computation logic for pod count#1862
mbvreddy merged 7 commits intokruize:pod-count-fixfrom
mbvreddy:compute-logic

Conversation

@mbvreddy
Copy link
Copy Markdown

@mbvreddy mbvreddy commented Apr 2, 2026

Description

This PR contains changes to include following section for each recommendation term.

"metrics_info": 
{
  "pod_count": 
  {
    "avg": 2,
    "min": 2,
    "max": 2
  }
}

At high level, following changes are made part of this PR

  1. Added constants related json response in KruizeConstants.JSONKeys
  2. Added new metric podCount to Enum MetricName
  3. Created new method checkIfMinDataAvailableForTerm to check min data-points in efficient way. Our current implementation assumes 15 mins measurement duration would never change in ROS. So, new method also is inline with that assumption.
  4. Added new instance variable in com.autotune.analyzer.recommendations.objects.TermRecommendations to support nee section metrics_info.
  5. com.autotune.analyzer.recommendations.engine.NamespaceRecommendationProcessor is updated to remove computation of no.of pods as it is not applicable for namespace recommendations as confirmed in previous comments.
  6. com.autotune.analyzer.recommendations.engine.ContainerRecommendationProcessor is updated to compute avg, min and max pod_count.
  7. I use existing model MetricAggregationInfoResults to hold the pod count. Note that this model define aggregations as Double. So, in response, we see “2.0” instead of “2”. If this has to be sent as “2”, we need to create new model object that holds aggregations as int instead of double. - Fixed by using custom serializer.
  8. Optimized code to filter results at recommendation term instead of recommendation model. This helps in elimination of redundant code to filter results multiple times. As part of this com.autotune.analyzer.recommendations.engine.ContainerRecommendationProcessor#generateRecommendationBasedOnModel method signature is modified to remove unused parameters monitoringStartTime and monitoringEndTime

Fixes # (issue)

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on: Kind
  • Used remote monitoring scripts to deploy and test.

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

Additional information

Include any additional information such as links, test results, screenshots here

Summary by Sourcery

Add per-term metrics metadata for pod count and refactor recommendation generation to use pre-filtered metric windows and metric-based data-availability checks.

New Features:

  • Expose a metrics_info section on each recommendation term, including aggregated pod_count statistics (avg, min, max).
  • Support a dedicated podCount metric and derive pod count aggregations from podCount, cpuUsage, or memoryUsage data as available.

Enhancements:

  • Refactor container and namespace recommendation processing to compute and reuse a filtered metrics window per term instead of re-filtering in each model.
  • Introduce generic, metrics-based minimum data checks for terms at container and namespace level, decoupling them from raw data objects.
  • Simplify the recommendation engine populateRecommendation API by removing explicit numPods handling and related validations, aligning pod count handling with the new metrics_info aggregation.
  • Update term threshold configuration comments to document the minimum datapoint expectations for each term type.

Summary by Sourcery

Add per-term metrics metadata for pod count and refactor recommendation processing to use pre-filtered metric windows and metric-based data availability checks.

New Features:

  • Expose a metrics_info section on each recommendation term including aggregated pod_count statistics.
  • Introduce a dedicated podCount metric and compute pod count aggregations from available metrics data.

Enhancements:

  • Refactor container and namespace recommendation processors to compute a filtered metrics window once per term and reuse it across models.
  • Simplify the recommendation engine populateRecommendation API by removing explicit pod count handling and validations, aligning with metrics-based pod count aggregation.
  • Add generic metrics-based minimum datapoint checks for container and namespace terms based on term thresholds.
  • Register a custom MetricAggregationInfoResults serializer to emit integer-valued aggregations in recommendation responses and document expected term datapoint thresholds via inline comments.

Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Apr 2, 2026

Reviewer's Guide

Adds per-term pod_count metrics_info aggregation to container recommendations, refactors recommendation processing to use pre-filtered metric windows and metric-based minimum data checks, and removes explicit pod count handling from the core engine and namespace recommendations while introducing a custom serializer to emit integer metric aggregations.

Sequence diagram for container term recommendation with pod_count metrics_info

sequenceDiagram
    actor User
    participant API as UpdateRecommendations
    participant Engine as RecommendationEngine
    participant ContProc as ContainerRecommendationProcessor
    participant Terms
    participant Model as RecommendationModel
    participant TermRec as TermRecommendations
    participant Gson as Gson
    participant Serializer as MetricAggregationInfoResultsIntSerializer

    User->>API: trigger recommendation update
    API->>Engine: generateRecommendations(kruizeObject)

    loop for each container and term
        Engine->>ContProc: generateRecommendationsBasedOnTerms(containerData, kruizeObject, termsMap)
        ContProc->>ContProc: compute monitoringStartTime, monitoringEndTime
        ContProc->>ContProc: build filteredResultsMap from containerData.results
        ContProc->>Terms: checkIfMinDataAvailableForTerm(filteredResultsMap, term, measurementDuration)
        Terms-->>ContProc: boolean hasMinData

        alt not enough data
            ContProc->>TermRec: new TermRecommendations
            ContProc->>TermRec: addNotification(INFO_NOT_ENOUGH_DATA)
        else enough data
            ContProc->>TermRec: new TermRecommendations
            ContProc->>ContProc: getPodCountAggrInfo(filteredResultsMap)
            ContProc-->>ContProc: MetricAggregationInfoResults podCountAggrInfo
            ContProc->>TermRec: addMetricsInfo(POD_COUNT, podCountAggrInfo)

            loop for each RecommendationModel
                Engine->>Engine: getModels()
                Engine-->>ContProc: model list
                ContProc->>Model: getCPURequestRecommendation(filteredResultsMap, notifications)
                ContProc->>Model: getMemoryRequestRecommendation(filteredResultsMap, notifications)
                ContProc->>Model: getAcceleratorRequestRecommendation(filteredResultsMap, notifications)
                ContProc->>Engine: populateRecommendation(termEntry, mappedRecommendationForModel, notifications, internalMapToPopulate, cpuThreshold, memoryThreshold, acceleratorMap, runtimeRecommList)
                Engine-->>ContProc: boolean success
            end
        end
    end

    API->>Gson: toJson(recommendationList)
    Gson->>Serializer: serialize(MetricAggregationInfoResults)
    Serializer-->>Gson: Json with avg, min, max as int
    Gson-->>API: JSON payload with metrics_info.pod_count
    API-->>User: HTTP response with updated recommendations
Loading

Class diagram for updated recommendation processing and pod_count metrics_info

classDiagram
    class ContainerRecommendationProcessor {
        - RecommendationEngineService engineService
        + boolean generateRecommendationsBasedOnTerms(ContainerData containerData, KruizeObject kruizeObject, Map~String, Terms~ termsMap)
        - MappedRecommendationForModel generateRecommendationBasedOnModel(RecommendationModel model, ContainerData containerData, Map~Timestamp, IntervalResults~ filteredResultsMap, KruizeObject kruizeObject, HashMap~AnalyzerConstants_ResourceSetting, HashMap~AnalyzerConstants_RecommendationItem, RecommendationConfigItem~~ currentConfigMap, Map_Entry~String, Terms~ termEntry)
        - static MetricAggregationInfoResults getPodCountAggrInfo(Map~Timestamp, IntervalResults~ filteredResultsMap)
    }

    class NamespaceRecommendationProcessor {
        - RecommendationEngineService engineService
        + boolean generateNamespaceRecommendationsBasedOnTerms(NamespaceData namespaceData, KruizeObject kruizeObject, Map~String, Terms~ termsMap)
        - MappedRecommendationForModel generateNamespaceRecommendationBasedOnModel(RecommendationModel model, Map~Timestamp, IntervalResults~ filteredResultsMap, RecommendationSettings recommendationSettings, HashMap~AnalyzerConstants_ResourceSetting, HashMap~AnalyzerConstants_RecommendationItem, RecommendationConfigItem~~ currentNamespaceConfigMap, Map_Entry~String, Terms~ termEntry)
    }

    class Terms {
        + int days
        + double threshold_in_days
        + static int getMaxDays(Map~String, Terms~ terms)
        + static boolean checkIfMinDataAvailableForTerm(Map~Timestamp, IntervalResults~ metricsData, Terms term, double measurementDuration)
        + static boolean checkIfMinDataAvailableForTerm(ContainerData containerData, Terms term, Timestamp monitoringEndTime, double measurementDuration)
        + static boolean checkIfMinDataAvailableForTermForNamespace(Map~Timestamp, IntervalResults~ metricsData, Terms term, double measurementDuration)
        + static boolean checkIfMinDataAvailableForTermForNamespace(NamespaceData namespaceData, Terms term, Timestamp monitoringEndTime, double measurementDuration)
    }

    class TermRecommendations {
        - HashMap~Integer, RecommendationNotification~ termLevelNotificationMap
        - HashMap~String, MetricAggregationInfoResults~ metricsInfo
        - Timestamp monitoringStartTime
        - Timestamp monitoringEndTime
        + void addNotification(RecommendationNotification notification)
        + void addMetricsInfo(String metricName, MetricAggregationInfoResults metricAggregationInfoResults)
        + void setMonitoringStartTime(Timestamp monitoringStartTime)
        + void setMonitoringEndTime(Timestamp monitoringEndTime)
    }

    class RecommendationEngineService {
        <<interface>>
        + boolean populateRecommendation(Map_Entry~String, Terms~ termEntry, MappedRecommendationForModel recommendationModel, ArrayList~RecommendationNotification~ notifications, HashMap~String, RecommendationConfigItem~ internalMapToPopulate, double cpuThreshold, double memoryThreshold, Map~AnalyzerConstants_RecommendationItem, RecommendationConfigItem~ recommendationAcceleratorRequestMap, List~RecommendationConfigEnv~ runtimeRecommList)
        + List~RecommendationModel~ getModels()
    }

    class RecommendationEngine {
        + boolean populateRecommendation(Map_Entry~String, Terms~ termEntry, MappedRecommendationForModel recommendationModel, ArrayList~RecommendationNotification~ notifications, HashMap~String, RecommendationConfigItem~ internalMapToPopulate, double cpuThreshold, double memoryThreshold, Map~AnalyzerConstants_RecommendationItem, RecommendationConfigItem~ recommendationAcceleratorRequestMap, List~RecommendationConfigEnv~ runtimeRecommList)
    }

    class MetricAggregationInfoResults {
        + Double avg
        + Double sum
        + Double min
        + Double max
        + Long count
        + Double median
        + Double mode
        + Double range
        + String format
        + Double getAvg()
        + Double getSum()
        + Double getMin()
        + Double getMax()
        + Long getCount()
        + Double getMedian()
        + Double getMode()
        + Double getRange()
        + String getFormat()
        + void setAvg(Double avg)
        + void setMin(Double min)
        + void setMax(Double max)
    }

    class MetricAggregationInfoResultsIntSerializer {
        <<JsonSerializer~MetricAggregationInfoResults~~>>
        + JsonElement serialize(MetricAggregationInfoResults metricAggregationInfoResults, Type type, JsonSerializationContext jsonSerializationContext)
    }

    class UpdateRecommendations {
        + boolean shouldSkipClass(Class clazz)
        + void someMethodBuildingGson()
    }

    class KruizeConstants_JSONKeys {
        <<static>>
        + String METRICS_INFO
        + String POD_COUNT
        + String PODS_COUNT
    }

    class AnalyzerConstants_MetricName {
        <<enum>>
        cpuUsage
        memoryUsage
        podCount
        namespaceTotalPods
        jvmInfo
        jvmInfoTotal
    }

    class KruizeObject {
        + static double getTermThresholdInDays(String term, Double measurement_duration)
    }

    ContainerRecommendationProcessor --> RecommendationEngineService : uses
    NamespaceRecommendationProcessor --> RecommendationEngineService : uses
    RecommendationEngineService <|.. RecommendationEngine
    ContainerRecommendationProcessor --> Terms : uses
    NamespaceRecommendationProcessor --> Terms : uses
    ContainerRecommendationProcessor --> TermRecommendations : creates
    NamespaceRecommendationProcessor --> TermRecommendations : creates
    TermRecommendations --> MetricAggregationInfoResults : contains
    ContainerRecommendationProcessor --> MetricAggregationInfoResults : computes
    ContainerRecommendationProcessor --> AnalyzerConstants_MetricName : uses
    NamespaceRecommendationProcessor --> AnalyzerConstants_MetricName : uses
    UpdateRecommendations --> MetricAggregationInfoResultsIntSerializer : registers
    UpdateRecommendations --> MetricAggregationInfoResults : serializes
    MetricAggregationInfoResultsIntSerializer --> MetricAggregationInfoResults : serializes
    KruizeObject --> KruizeConstants_JSONKeys : uses
    TermRecommendations --> KruizeConstants_JSONKeys : uses
    ContainerRecommendationProcessor --> KruizeConstants_JSONKeys : uses
    Terms --> KruizeConstants_JSONKeys : uses
Loading

File-Level Changes

Change Details Files
Compute and expose per-term pod_count aggregations (avg, min, max) on container recommendations using a new metrics_info section and podCount metric.
  • Filter container metric results once per term and reuse the filtered window across all models.
  • Use a new getPodCountAggrInfo helper to derive pod count aggregations from podCount, cpuUsage, or memoryUsage metrics with validation of aggregation fields.
  • Attach pod_count MetricAggregationInfoResults to each TermRecommendations via addMetricsInfo using the new JSONKeys.METRICS_INFO and POD_COUNT constants.
  • Stop passing monitoringStartTime/monitoringEndTime into generateRecommendationBasedOnModel, instead passing the pre-filtered results map.
src/main/java/com/autotune/analyzer/recommendations/engine/ContainerRecommendationProcessor.java
src/main/java/com/autotune/analyzer/recommendations/objects/TermRecommendations.java
src/main/java/com/autotune/utils/KruizeConstants.java
src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
Refactor minimum-data checks for terms to operate directly on metrics windows instead of container/namespace data objects.
  • Introduce overloaded checkIfMinDataAvailableForTerm and checkIfMinDataAvailableForTermForNamespace that take a Map<Timestamp, IntervalResults>.
  • Compute required minimum datapoints from term.threshold_in_days, measurement duration, and time conversion constants.
  • Update container and namespace term recommendation flows to call the new overloads using pre-filtered result maps.
src/main/java/com/autotune/analyzer/recommendations/term/Terms.java
src/main/java/com/autotune/analyzer/recommendations/engine/ContainerRecommendationProcessor.java
src/main/java/com/autotune/analyzer/recommendations/engine/NamespaceRecommendationProcessor.java
src/main/java/com/autotune/utils/KruizeConstants.java
src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java
Remove explicit pod count computation/validation from the core recommendation engine and namespace recommendations, aligning pod handling with metrics-based metrics_info.
  • Drop numPods parameter from RecommendationEngine.populateRecommendation API and RecommendationEngineService interface.
  • Remove engine-level numPods zero/negative validation and podsCount mutation; Container/Namespace processors no longer compute or pass numPods.
  • Delete getNumPods and getNumPodsForNamespace helpers and related namespaceTotalPods usage.
  • Simplify namespace recommendation model generation to just operate on filtered metrics data and thresholds.
src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java
src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngineService.java
src/main/java/com/autotune/analyzer/recommendations/engine/ContainerRecommendationProcessor.java
src/main/java/com/autotune/analyzer/recommendations/engine/NamespaceRecommendationProcessor.java
Ensure pod_count aggregations serialize as integers in recommendation responses while still using MetricAggregationInfoResults internally.
  • Introduce MetricAggregationInfoResultsIntSerializer to emit integer-valued aggregation fields (avg, sum, min, max, median, mode, range) and preserve format.
  • Register the custom serializer in UpdateRecommendations when building the Gson instance so all MetricAggregationInfoResults are serialized via this adapter.
src/main/java/com/autotune/analyzer/adapters/MetricAggregationInfoResultsIntSerializer.java
src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java

Possibly linked issues

  • #unknown: PR implements term-level pod_count metrics_info (min, max, avg) exactly fulfilling the replica info requested in issue
  • #Improvise the data processing mechanism before sending the results to recommendation engine: PR implements per-term filteredResults reuse and term-level min-data checks, directly addressing the issue’s data duplication and perf concerns.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The new filtering for filteredResultsMap uses after(monitoringStartTime) and before(monitoringEndTime), whereas the previous logic used >= / <=; if you still intend to include samples exactly at the boundaries, you should switch to !before(start) / !after(end) or use compareTo as before.
  • Removing numPods from populateRecommendation and the associated validation while introducing metrics_info.pod_count changes where pod count is enforced and surfaced; consider either wiring the new pod count aggregation back into MappedRecommendationForModel.setPodsCount or clearly consolidating on one representation to avoid pods_count silently remaining unset.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new filtering for `filteredResultsMap` uses `after(monitoringStartTime)` and `before(monitoringEndTime)`, whereas the previous logic used `>=` / `<=`; if you still intend to include samples exactly at the boundaries, you should switch to `!before(start)` / `!after(end)` or use `compareTo` as before.
- Removing `numPods` from `populateRecommendation` and the associated validation while introducing `metrics_info.pod_count` changes where pod count is enforced and surfaced; consider either wiring the new pod count aggregation back into `MappedRecommendationForModel.setPodsCount` or clearly consolidating on one representation to avoid `pods_count` silently remaining unset.

## Individual Comments

### Comment 1
<location path="src/main/java/com/autotune/analyzer/recommendations/engine/ContainerRecommendationProcessor.java" line_range="336-345" />
<code_context>
+        // 2. Calculate from 'cpuUsage' datapoints using formulae avg of 'sum/avg', min of 'sum/avg', max of 'sum/avg'
</code_context>
<issue_to_address>
**issue (bug_risk):** Division by zero / Infinity risk when computing pod counts from cpuUsage metrics.

In the `cpuUsageMetrics` branch, `aggInfo.getAvg()` is only checked for non-null, so `aggInfo.getSum() / aggInfo.getAvg()` can divide by zero or near-zero values, causing exceptions or `Infinity`/`NaN` that propagate into `avg`, `min`, and `max`. The same risk exists in the memory branch. Please guard against `aggInfo.getAvg() == 0` (or effectively zero) before dividing, e.g., by filtering those entries or routing them to the error path instead of including them in the aggregates.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

mbvreddy added 3 commits April 2, 2026 10:57
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
@mbvreddy mbvreddy requested a review from khansaad April 3, 2026 05:08
@mbvreddy
Copy link
Copy Markdown
Author

mbvreddy commented Apr 3, 2026

@sourcery-ai review

@khansaad khansaad added the enhancement New feature or request label Apr 6, 2026
@khansaad khansaad moved this to Under Review in Monitoring Apr 6, 2026
@khansaad khansaad added this to the Kruize 0.11.0 Release milestone Apr 6, 2026
mbvreddy added 3 commits April 6, 2026 13:26
…ce and containers

Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
@mbvreddy
Copy link
Copy Markdown
Author

mbvreddy commented Apr 6, 2026

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • Registering MetricAggregationInfoResultsIntSerializer globally in UpdateRecommendations means all MetricAggregationInfoResults in the response are coerced to ints; if only pod_count needs integer output, consider scoping the serializer to that field or using a dedicated DTO to avoid truncating other metric aggregations.
  • getPodCountAggrInfo logs full metric contents at INFO level and returns null on failure; consider downgrading these logs to DEBUG and returning a non-null MetricAggregationInfoResults with zeroed values so callers don’t need to handle null and logs stay less noisy in normal operation.
  • The new Terms.checkIfMinDataAvailableForTerm(Map...) and checkIfMinDataAvailableForTermForNamespace(Map...) duplicate logic with the existing container/namespace versions; it would be cleaner to centralize the common datapoint-count calculation in a single helper to ensure future threshold changes stay consistent across all callers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Registering MetricAggregationInfoResultsIntSerializer globally in UpdateRecommendations means *all* MetricAggregationInfoResults in the response are coerced to ints; if only pod_count needs integer output, consider scoping the serializer to that field or using a dedicated DTO to avoid truncating other metric aggregations.
- getPodCountAggrInfo logs full metric contents at INFO level and returns null on failure; consider downgrading these logs to DEBUG and returning a non-null MetricAggregationInfoResults with zeroed values so callers don’t need to handle null and logs stay less noisy in normal operation.
- The new Terms.checkIfMinDataAvailableForTerm(Map...) and checkIfMinDataAvailableForTermForNamespace(Map...) duplicate logic with the existing container/namespace versions; it would be cleaner to centralize the common datapoint-count calculation in a single helper to ensure future threshold changes stay consistent across all callers.

## Individual Comments

### Comment 1
<location path="src/main/java/com/autotune/analyzer/recommendations/engine/ContainerRecommendationProcessor.java" line_range="313-322" />
<code_context>
+    private static MetricAggregationInfoResults getPodCountAggrInfo(Map<Timestamp, IntervalResults> filteredResultsMap) {
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Return-null behavior from getPodCountAggrInfo can cause confusing logs and makes callers responsible for dealing with failure.

Currently this returns `null` and logs an error when no pod count can be derived, while callers log `podCountAggrInfo` (possibly `null`) and `addMetricsInfo` silently ignores it. This yields an error log with no clear API-level signal that pod-count data is unavailable and increases the risk of future NPEs if a caller forgets the null-check. Prefer returning a `MetricAggregationInfoResults` instance with explicit sentinel values or a flag to indicate “no data,” and consider downgrading the log level if this is an expected condition.

Suggested implementation:

```java
     *
     * @param filteredResultsMap
     * @return aggregated results like min, max, avg of pods from the results supplied. Never {@code null}.
     *         Use {@link MetricAggregationInfoResults#hasData()} to detect when pod-count data could not be derived.
     */
    private static MetricAggregationInfoResults getPodCountAggrInfo(Map<Timestamp, IntervalResults> filteredResultsMap) {
        MetricAggregationInfoResults metricAggregationInfoResults = new MetricAggregationInfoResults();
        // Default to "no data"; set to true only when we successfully derive pod-count metrics.
        metricAggregationInfoResults.setHasData(false);
        Double avg = 0.0, min = 0.0, max = 0.0;
        LOGGER.info("filteredResultsMap: size = {}", filteredResultsMap.size());

```

To fully implement the behavior you described, the following additional changes are needed elsewhere in the codebase:

1. **Update `MetricAggregationInfoResults` to carry an explicit “has data” flag:**
   - In the `MetricAggregationInfoResults` class, add:
     - A private boolean field, e.g. `private boolean hasData;`
     - Public getter/setter: `public boolean hasData()` and `public void setHasData(boolean hasData)`
     - Initialize `hasData` to `false` by default (either in the field declaration or in constructors).
   - Ensure `MetricAggregationInfoResults` instances created in other places set `hasData(true)` when they contain valid aggregation data.

2. **Mark successful pod-count aggregation in `getPodCountAggrInfo`:**
   - In the body of `getPodCountAggrInfo`, wherever you currently:
     - Successfully compute `avg`, `min`, `max` for pod count and populate `metricAggregationInfoResults`,
   - Add `metricAggregationInfoResults.setHasData(true);` right after setting the computed values.

3. **Remove null-return behavior and adjust logging in `getPodCountAggrInfo`:**
   - Locate any branches in `getPodCountAggrInfo` that currently:
     - Log an error (likely `LOGGER.error(...)`) because no pod count could be derived and then `return null;`.
   - Replace those with:
     - A downgraded log level (`LOGGER.info` or `LOGGER.debug`) with a message like: `"No pod-count aggregation data could be derived; returning MetricAggregationInfoResults with hasData=false"`.
     - `return metricAggregationInfoResults;` (which will have `hasData` already set to `false`).
   - After these changes, **ensure the method never returns `null`**.

4. **Update callers of `getPodCountAggrInfo`:**
   - Replace any null checks like:
     - `if (podCountAggrInfo != null) { ... }`
   - With explicit flag checks:
     - `if (podCountAggrInfo.hasData()) { ... } else { /* handle "no data" case if needed */ }`
   - In `addMetricsInfo` (and any similar methods), instead of silently ignoring a `null` result, they should:
     - Always accept a non-null `MetricAggregationInfoResults`.
     - Optionally log or branch behavior based on `hasData()` if they need to distinguish “no data” from “zero-valued data.5. **Align logging expectations:**
   - Since “no pod-count data” is expected in some situations, ensure that:
     - There are no remaining `LOGGER.error` calls solely for the absence of pod-count data.
     - Such conditions, if logged, use `info` or `debug` and clearly state that the API will return a non-null object with `hasData=false`.

These additional changes will remove the confusing null-return behavior, provide a clear API-level signal when pod-count data is unavailable, and reduce the risk of future NPEs from missing null checks.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Copy Markdown
Contributor

@khansaad khansaad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mbvreddy mbvreddy merged commit cf444bc into kruize:pod-count-fix Apr 6, 2026
3 checks passed
@github-project-automation github-project-automation Bot moved this from Under Review to Done in Monitoring Apr 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants