Skip to content

[WIP] Added a experimentCache to be used in createExperiment to check for existing experiments#1782

Closed
ncau wants to merge 58 commits intokruize:masterfrom
ncau:ceAPI
Closed

[WIP] Added a experimentCache to be used in createExperiment to check for existing experiments#1782
ncau wants to merge 58 commits intokruize:masterfrom
ncau:ceAPI

Conversation

@ncau
Copy link
Copy Markdown

@ncau ncau commented Jan 30, 2026

Description

Added a experimentCache to be used in createExperiment to check for existing experiments.

Fixes # (issue)

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on:

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

Additional information

Include any additional information such as links, test results, screenshots here

Summary by Sourcery

Introduce an in-memory experiment cache to avoid redundant database lookups when creating and deleting experiments.

New Features:

  • Add an ExperimentCache utility to track existing experiment names using an in-memory set backed by on-demand database lookups.
  • Enforce conflict responses when attempting to create an experiment with a name that already exists.

Enhancements:

  • Reuse singleton Gson and ExperimentDBService instances within CreateExperiment to avoid repeated object creation.
  • Simplify experiment deletion by relying on the new cache instead of preloading experiments into an in-memory map.

pinkygupta-hub and others added 12 commits January 23, 2026 12:12
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Copilot AI review requested due to automatic review settings January 30, 2026 11:24
@ncau ncau added this to Monitoring Jan 30, 2026
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Jan 30, 2026

Reviewer's Guide

Introduce an in-memory ExperimentCache to short-circuit DB lookups when creating and deleting experiments, and refactor CreateExperiment to reuse singleton Gson and ExperimentDBService instances while wiring the cache into create/delete flows.

Sequence diagram for createExperiment POST flow with ExperimentCache

sequenceDiagram
    actor Client
    participant CreateExperiment
    participant ExperimentCache
    participant ExperimentDBService

    Client->>CreateExperiment: HTTP POST /createExperiment
    CreateExperiment->>CreateExperiment: parse body to List CreateExperimentAPIObject
    loop for each CreateExperimentAPIObject
        CreateExperiment->>ExperimentCache: isExists(experimentName)
        alt experiment exists
            ExperimentCache-->>CreateExperiment: true
            CreateExperiment-->>Client: 409 Conflict (Experiment name already exists)
        else experiment does not exist
            ExperimentCache-->>CreateExperiment: false
            CreateExperiment->>CreateExperiment: setExperiment_id, setStatus
            CreateExperiment->>CreateExperiment: validate to KruizeObject
        end
    end
    CreateExperiment->>CreateExperiment: check all KruizeObject validation_data
    alt all experiments valid
        loop for each valid experiment
            CreateExperiment->>ExperimentDBService: addExperimentToDB(apiObject)
            ExperimentDBService-->>CreateExperiment: ValidationOutputData
            alt add success
                CreateExperiment->>ExperimentCache: add(experimentName)
            else add failed
                CreateExperiment->>CreateExperiment: mark allSuccessful = false
            end
        end
        alt allSuccessful
            CreateExperiment-->>Client: 201 Created (Experiment registered successfully)
        else at least one failed
            CreateExperiment-->>Client: 400 Bad Request (failure message)
        end
    else invalid KruizeObject
        CreateExperiment-->>Client: error response with validation errorCode and message
    end
Loading

Sequence diagram for deleteExperiment DELETE flow using ExperimentCache

sequenceDiagram
    actor Client
    participant CreateExperiment
    participant ExperimentCache
    participant ExperimentDAOImpl

    Client->>CreateExperiment: HTTP DELETE /createExperiment?rm=true|false
    CreateExperiment->>CreateExperiment: parse body to CreateExperimentAPIObject[]
    alt multiple experiments
        CreateExperiment-->>Client: 400 Bad Request (unsupported experiment)
    else single experiment
        loop for each CreateExperimentAPIObject
            CreateExperiment->>CreateExperiment: expName = getExperimentName()
            CreateExperiment->>ExperimentCache: isExists(expName)
            alt experiment not in cache and DB
                ExperimentCache-->>CreateExperiment: false
                CreateExperiment->>CreateExperiment: throw Exception("Experiment not found")
                CreateExperiment-->>Client: error response
            else experiment found
                ExperimentCache-->>CreateExperiment: true
                alt rmTable == true
                    CreateExperiment->>ExperimentDAOImpl: deleteKruizeExperimentEntryByName(expName)
                else rmTable == false
                    CreateExperiment->>ExperimentDAOImpl: deleteKruizeLMExperimentEntryByName(expName)
                end
                ExperimentDAOImpl-->>CreateExperiment: ValidationOutputData
                alt delete success
                    CreateExperiment->>ExperimentCache: remove(expName)
                else delete failed
                    CreateExperiment->>CreateExperiment: throw Exception("Experiment not deleted")
                    CreateExperiment-->>Client: error response
                end
            end
        end
        CreateExperiment-->>Client: 201 Created (Experiment deleted successfully)
    end
Loading

Class diagram for ExperimentCache integration in CreateExperiment

classDiagram
    class CreateExperiment {
        <<Servlet>>
        - Logger LOGGER
        - ExperimentCache experimentCache
        - Gson gson
        - ExperimentDBService experimentDBService
        + void doPost(HttpServletRequest request, HttpServletResponse response)
        + void doDelete(HttpServletRequest request, HttpServletResponse response)
        - void sendSuccessResponse(HttpServletResponse response, String message)
    }

    class ExperimentCache {
        - Logger LOGGER
        - HashSet~String~ experimentNamesCache
        - ExperimentDBService experimentDBService
        + boolean isExists(String experiment_name)
        + void add(String experiment_name)
        + void remove(String experiment_name)
    }

    class ExperimentDBService {
        + void loadExperimentFromDBByName(Map~String, KruizeObject~ experimentMap, String experimentName)
        + void loadLMExperimentFromDBByName(Map~String, KruizeObject~ experimentMap, String experimentName)
        + ValidationOutputData addExperimentToDB(CreateExperimentAPIObject apiObject)
    }

    class ExperimentDAOImpl {
        + ValidationOutputData deleteKruizeExperimentEntryByName(String experimentName)
        + ValidationOutputData deleteKruizeLMExperimentEntryByName(String experimentName)
    }

    class KruizeDeploymentInfo {
        + boolean is_ros_enabled
    }

    class CreateExperimentAPIObject {
        + String getExperimentName()
        + void setExperiment_id(String experimentId)
        + void setStatus(String status)
        + void setValidationData(ValidationOutputData validationData)
    }

    class KruizeObject {
        + String getExperimentName()
        + ValidationOutputData getValidation_data()
    }

    class ValidationOutputData {
        + boolean isSuccess()
        + String getMessage()
    }

    CreateExperiment --> ExperimentCache : uses
    CreateExperiment --> ExperimentDBService : uses
    CreateExperiment --> ExperimentDAOImpl : uses
    ExperimentCache --> ExperimentDBService : uses
    ExperimentCache --> KruizeDeploymentInfo : reads
    ExperimentDBService --> KruizeObject : loads
    CreateExperiment --> CreateExperimentAPIObject : parses
    CreateExperiment --> KruizeObject : validates
    CreateExperiment --> ValidationOutputData : handles
    ExperimentDAOImpl --> ValidationOutputData : returns
Loading

File-Level Changes

Change Details Files
Add ExperimentCache utility class for caching experiment existence with DB fallback.
  • Implement ExperimentCache with a synchronized HashSet-based name cache and an internal ExperimentDBService for cache-miss lookups.
  • On cache miss, query the appropriate experiments table depending on KruizeDeploymentInfo.is_ros_enabled and populate the cache if the experiment exists.
  • Expose synchronized add and remove methods to manage cached experiment names and log cache operations.
src/main/java/com/autotune/analyzer/utils/ExperimentCache.java
Wire ExperimentCache into CreateExperiment servlet for create and delete operations and reuse shared service instances.
  • Instantiate a single ExperimentCache, Gson, and ExperimentDBService as fields in CreateExperiment to avoid per-request allocations.
  • On experiment creation, check ExperimentCache before processing and return HTTP 409 if the experiment name already exists, skipping further work.
  • After successfully persisting experiments, add their names to ExperimentCache and only report overall success if all DB inserts are successful, otherwise return HTTP 400 with a safe fallback message when ValidationOutputData is null.
  • In delete flow, replace mKruizeExperimentMap/DB load logic with an ExperimentCache existence check, delete via ExperimentDAOImpl, and remove names from the cache on successful deletion.
  • Update JSON serialization in sendSuccessResponse and request parsing in doDelete to use the shared Gson instance instead of creating new ones.
src/main/java/com/autotune/analyzer/services/CreateExperiment.java

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • The ExperimentCache uses a plain HashSet with synchronized methods; consider switching to a concurrent set (e.g. Set<String> experimentNamesCache = Collections.newSetFromMap(new ConcurrentHashMap<>());) to avoid coarse-grained locking at the object level and simplify concurrency control.
  • In doDelete, the previous logic that loaded experiments into mKruizeExperimentMap and relied on that map for existence checks has been removed in favor of ExperimentCache.isExists; this changes behavior to be entirely cache/DB driven and bypasses mKruizeExperimentMap, so verify that any other code paths relying on mKruizeExperimentMap consistency (especially in the same JVM) are not affected or re-align them with the new cache approach.
  • Both CreateExperiment and ExperimentCache directly instantiate their own ExperimentDBService instances; if the service is intended to be shared or managed (e.g. for connection pooling, configuration, or testability), consider injecting it or using a shared singleton rather than creating new instances inside these classes.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `ExperimentCache` uses a plain `HashSet` with `synchronized` methods; consider switching to a concurrent set (e.g. `Set<String> experimentNamesCache = Collections.newSetFromMap(new ConcurrentHashMap<>());`) to avoid coarse-grained locking at the object level and simplify concurrency control.
- In `doDelete`, the previous logic that loaded experiments into `mKruizeExperimentMap` and relied on that map for existence checks has been removed in favor of `ExperimentCache.isExists`; this changes behavior to be entirely cache/DB driven and bypasses `mKruizeExperimentMap`, so verify that any other code paths relying on `mKruizeExperimentMap` consistency (especially in the same JVM) are not affected or re-align them with the new cache approach.
- Both `CreateExperiment` and `ExperimentCache` directly instantiate their own `ExperimentDBService` instances; if the service is intended to be shared or managed (e.g. for connection pooling, configuration, or testability), consider injecting it or using a shared singleton rather than creating new instances inside these classes.

## Individual Comments

### Comment 1
<location> `src/main/java/com/autotune/analyzer/services/CreateExperiment.java:216-217` </location>
<code_context>
-                        }
-                    } else
+                    
+                    // Use ExperimentCache to check if experiment exists (instead of loading from DB)
+                    if (!experimentCache.isExists(expName)) {
                         throw new Exception("Experiment not found!");
+                    }
</code_context>

<issue_to_address>
**issue (bug_risk):** Cache-backed existence check ignores `rmTable` and may consult a different backing table than the one being deleted from.

`rmTable` controls which table is used for deletion, but `ExperimentCache.isExists()` chooses the table based on `KruizeDeploymentInfo.is_ros_enabled`. If these can differ, the existence check may hit a different table than the delete, causing false "not found" errors or inconsistent behavior. Consider either passing the target table into the cache API or deriving the existence check from the same `rmTable` condition used for deletion.
</issue_to_address>

### Comment 2
<location> `src/main/java/com/autotune/analyzer/utils/ExperimentCache.java:47-49` </location>
<code_context>
+     * @param experiment_name the experiment name to check
+     * @return true if experiment exists, false otherwise
+     */
+    public synchronized boolean isExists(String experiment_name) {
+        if (experiment_name == null) {
+            return false;
+        }
+        
</code_context>

<issue_to_address>
**suggestion (performance):** Synchronized `isExists` wraps both cache access and DB I/O, which can unnecessarily serialize requests and limit throughput.

Because `isExists` is `synchronized` and may perform a blocking DB call, all callers are serialized on a busy servlet. Consider using a thread-safe set (e.g., `ConcurrentHashMap.newKeySet()`) and synchronizing only the small section that mutates the cache, or doing the DB lookup outside the lock and synchronizing only the cache update. This keeps correctness while reducing lock contention and improving concurrency.

Suggested implementation:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

```

```java
    // Use a concurrent set to avoid coarse-grained synchronization on read paths
    private final Set<String> experimentNamesCache = ConcurrentHashMap.newKeySet();

```

```java
    /**
     * Check if an experiment exists in cache or database.
     *
     * Fast path:
     *   - Null check
     *   - Concurrent cache lookup
     * Slow path:
     *   - DB lookup (potentially blocking)
     *   - On positive hit, cache the experiment name
     *
     * This avoids serializing callers on a single monitor while still keeping the cache consistent.
     *
     * @param experiment_name the experiment name to check
     * @return true if experiment exists, false otherwise
     */

```

```java
    public boolean isExists(String experiment_name) {
        // Fast null check
        if (experiment_name == null) {
            return false;
        }

        // Fast-path cache lookup (non-blocking, thread-safe)
        if (experimentNamesCache.contains(experiment_name)) {
            return true;
        }

        // Slow-path DB lookup outside any lock to avoid serializing callers
        boolean existsInDb = experimentDBService.isExists(experiment_name);

        // If found in DB, cache it for future calls
        if (existsInDb) {
            experimentNamesCache.add(experiment_name);
            return true;
        }

        return false;

```

1. If `ExperimentDBService` does not expose a `boolean isExists(String experimentName)` API, replace the call
   `experimentDBService.isExists(experiment_name)` with the appropriate existing method (e.g., `getExperiment(...) != null`).
2. Once the migration to `Set<String> experimentNamesCache = ConcurrentHashMap.newKeySet()` is complete and no other code
   depends on `HashSet` specifically, you can remove the now-unused `import java.util.HashSet;` line.
3. If there are other methods that manually synchronize on `experimentNamesCache` (e.g., `synchronized` blocks or methods),
   you can likely remove those coarse locks and rely on the concurrent set, keeping only minimal synchronization if compound
   operations require atomicity beyond a single `add`/`contains`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java Outdated
Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java Outdated

This comment was marked as duplicate.

@ncau ncau requested a review from mbvreddy January 30, 2026 13:22
@rbadagandi1 rbadagandi1 added this to the Kruize 0.9 Release milestone Feb 2, 2026
@rbadagandi1 rbadagandi1 moved this to In Progress in Monitoring Feb 2, 2026
pinkygupta-hub and others added 4 commits February 2, 2026 14:11
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
…verhead

Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
chandrams and others added 22 commits March 3, 2026 12:44
…lt wait time

Signed-off-by: Chandrakala Subramanyam <csubrama@redhat.com>
Signed-off-by: Chandrakala Subramanyam <csubrama@redhat.com>
Signed-off-by: Chandrakala Subramanyam <csubrama@redhat.com>
…e test

Signed-off-by: Chandrakala Subramanyam <csubrama@redhat.com>
Signed-off-by: bharathappali <abharath@redhat.com>
Signed-off-by: Chandrakala Subramanyam <csubrama@redhat.com>
Updated RH UBI minimal version 10.1 to latest available
Updated kruize demos test to include runtimes demo
Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Removed unused files in scale test and fixed repo path
KRUIZE-1023 : Implement cache for performance profiles to reduce DB overhead
adding stale bot for workflow
Included dependabot yml file to update dependencies and github actions
Onboarding Unit Testing with Mocking Framework for Kruize
Signed-off-by: Bhaktavatsal Reddy <bhamaram@in.ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

9 participants