45 commits
dc7426a
feat: Add gradient optimizer based on ProTeGi by "Automatic Prompt Op…
DanielDango Dec 23, 2025
afc5741
feat: implement caching for OpenAI chat models to optimize performance
DanielDango Jan 7, 2026
2adc722
feat: return intermediate results during prompt optimization
DanielDango Jan 7, 2026
778d5f0
feat: encapsulate cache replacement strategies into enum
DanielDango Jan 20, 2026
aecfb43
Merge remote-tracking branch 'upstream/feature/add-prompt-optimizatio…
DanielDango Jan 22, 2026
7fbcaac
Merge branch 'feature/add-prompt-optimization-module' into feature/ad…
DanielDango Feb 16, 2026
52fb042
Merge remote-tracking branch 'upstream/feature/add-prompt-optimizatio…
DanielDango Feb 18, 2026
279d7c3
Revert "revert: remove evaluator from config which is to be added wit…
DanielDango Feb 18, 2026
35afddd
Revert "revert: Evaluator module only required for future gradient op…
DanielDango Feb 18, 2026
bedf3f1
refactor: fix package hierarchy
DanielDango Feb 18, 2026
14e5f71
revert: remove CacheReplacementStrategy.java
DanielDango Feb 18, 2026
23c2750
chore: pull recent changes into ProTeGi implementation
DanielDango Feb 18, 2026
891081c
fix: rename logger
DanielDango Feb 18, 2026
bded850
Merge branch 'feature/add-prompt-optimization-module' into feature/ad…
DanielDango Feb 18, 2026
678fae1
Merge remote-tracking branch 'origin/main' into feature/add-protegi-o…
dfuchss Feb 19, 2026
8eabec5
Spotless applied
dfuchss Feb 19, 2026
6523e69
Make chat language model provider lazy
dfuchss Feb 19, 2026
05a5877
chore: update optimize methods to return lists of optimized prompts
DanielDango Mar 4, 2026
6929d4b
fix: add docs and address APO issues
DanielDango Mar 4, 2026
904d9dd
refactor: clarify evaluator module and rename to Selector
DanielDango Mar 4, 2026
fa8174a
chore: standardize configuration parameter names and improve document…
DanielDango Mar 5, 2026
5093db3
refactor: replace maximum_iterations statistic regex replacement with…
DanielDango Mar 5, 2026
271efd2
chore: extend configurable fields for the AutomaticPromptOptimizer
DanielDango Mar 5, 2026
d333c3e
refactor: rename AutomaticPromptOptimizer.java and GradientOptimizerC…
DanielDango Mar 5, 2026
5564254
Apply suggestion from @dfuchss
dfuchss Mar 6, 2026
dc33542
Update src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/P…
dfuchss Mar 6, 2026
7424af6
refactor: rename evaluator to selector in configuration and related f…
DanielDango Mar 6, 2026
e8b7a06
revert: remove e2e test for sample configs as the datasets are not pa…
DanielDango Mar 6, 2026
bc3c82f
refactor: access the config object directly instead of manipulating j…
DanielDango Mar 6, 2026
a282af2
refactor: pull SamplerFactory.java into SampleStrategy.java
DanielDango Mar 6, 2026
4bf68ff
chore: various tweaks
DanielDango Mar 6, 2026
dff2a0c
revert: remove intermediate statistics as the config fields are not r…
DanielDango Mar 6, 2026
9adb38f
Flip correct flag
dfuchss Mar 18, 2026
ee20a09
Merge remote-tracking branch 'origin/feature/add-protegi-optimization…
DanielDango Apr 1, 2026
df3b657
Merge remote-tracking branch 'upstream/main' into feature/add-protegi…
DanielDango Apr 1, 2026
cdd041b
refactor: remove unused import in EvaluationResult.java
DanielDango Apr 1, 2026
5a47ebd
refactor: improve configuration validation in optimization classes, c…
DanielDango Apr 1, 2026
b90be13
Merge remote-tracking branch 'upstream/main' into feature/add-protegi…
DanielDango Apr 16, 2026
786731f
refactor: make Selector in PromptOptimizer nullable and ensure proper…
DanielDango Jan 20, 2026
d7958fb
docs: enhance prompt optimization documentation for clarity and detail
DanielDango Apr 16, 2026
b3a081e
refactor: simplify Selectors by removing the development artifact moc…
DanielDango Apr 16, 2026
a3b68a3
feat: add support for new ChatRequestOptions in chat method
DanielDango Apr 16, 2026
841f693
refactor: remove nullability from createSelector as mock selector was…
DanielDango Apr 16, 2026
c12a75a
docs: remove MockSelector references from documentation
DanielDango Apr 16, 2026
541723d
Do not expose LazyChatModel
dfuchss Apr 24, 2026
71 changes: 71 additions & 0 deletions docs/prompt-optimization.md
@@ -8,6 +8,39 @@ This also enables us to quantify the importance of well-designed prompts in the

## Core Components

### Overview of Prompt Optimization Subcomponents

The table below provides a brief overview of the subcomponents used in the prompt optimization module.

| Component | **SampleStrategy** | **Selector** | **Metric** |
|------------------------|----------------------------------|------------------------------------------------|--------------------------------|
| **Location** | `promptoptimizer.samplestrategy` | `promptoptimizer.promptselector` | `promptoptimizer.promptmetric` |
| **Purpose** | Select items from a collection | Orchestrate prompt evaluation with budget | Calculate performance scores |
| **Answers** | "Which generic items to use?" | "Which prompts to test when?" | "How good is this prompt?" |
| **Method** | `sample(items, sampleSize)` | `selectAndEvaluate(prompts, examples, metric)` | `getMetric(prompts, examples)` |
| **Algorithm Examples** | First/Ordered/Shuffled | Simple/UCB Bandit | Pointwise/FBeta |

### Sample Strategies (`samplestrategy` package)

A [`SampleStrategy`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/samplestrategy/SampleStrategy.java) determines how to select a subset of items from a collection.
These strategies are used throughout the optimization process to sample items when the full set would be too large or expensive to process.
The key method `sample(items, sampleSize)` returns a list of selected items based on the strategy's selection logic.
In practice, items may be classification examples, candidate prompts, or simple identifiers, depending on the context in which the sampler is used.

Custom sample strategies can be added by implementing the [`SampleStrategy`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/samplestrategy/SampleStrategy.java) interface and integrating them via the static factory method `SampleStrategy.createSampler(...)` defined there.

#### Available Sample Strategies

- **[`First Sampler`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/samplestrategy/FirstSampler.java)** (`first`):
Selects the first n items from the collection without any modification.
Maintains the original order of items.
- **[`Ordered First Sampler`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/samplestrategy/OrderedFirstSampler.java)** (`ordered`):
Sorts items before selecting the first n items.
Ensures deterministic sampling based on the natural ordering of items.
- **[`Shuffled First Sampler`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/samplestrategy/ShuffledFirstSampler.java)** (`shuffled`):
Randomly shuffles items before selecting the first n items.
Provides random sampling with reproducibility through seeded random number generation.
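
The shuffled strategy above can be sketched as follows. This is an illustrative sketch, not the actual `ShuffledFirstSampler` code; the class name, constructor, and seeding scheme are assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Minimal sketch of a shuffled-first sampler: shuffle with a fixed seed, then take the first n items.
// Names are illustrative; the real ShuffledFirstSampler may differ in signature and seeding.
public class ShuffledSamplerSketch {
    private final long seed;

    public ShuffledSamplerSketch(long seed) {
        this.seed = seed;
    }

    public <T> List<T> sample(List<T> items, int sampleSize) {
        List<T> copy = new ArrayList<>(items);
        // Seeded shuffle makes the "random" selection reproducible across runs.
        Collections.shuffle(copy, new Random(seed));
        return copy.subList(0, Math.min(sampleSize, copy.size()));
    }
}
```

Because the seed is fixed per sampler instance, repeated runs over the same item list yield the same sample, which keeps optimization runs reproducible.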

### Prompt Metrics (`promptmetric` package)

A [`Metric`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/Metric.java) is a numeric measure used to evaluate the quality of prompts during the optimization process.
@@ -30,6 +63,29 @@ Custom metrics can be added either through implementation of the [`Global Metric
- Mean
- **[`Mock Metric`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/MockMetric.java)** (`mock`): Returns dummy values for testing purposes
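
For orientation, an FBeta-style metric combines precision and recall via the F-beta score. The following is a generic sketch of that formula, not the module's actual `FBeta` implementation:

```java
// Generic F-beta computation from true/false positives and false negatives.
// Illustrative only; the module's FBeta metric may expose a different interface.
public class FBetaSketch {
    public static double fBeta(int truePositives, int falsePositives, int falseNegatives, double beta) {
        double precision = truePositives / (double) (truePositives + falsePositives);
        double recall = truePositives / (double) (truePositives + falseNegatives);
        double betaSquared = beta * beta;
        // F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); beta > 1 weights recall higher.
        return (1 + betaSquared) * precision * recall / (betaSquared * precision + recall);
    }
}
```

With beta = 1 this reduces to the familiar F1 score, the harmonic mean of precision and recall.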

### Selectors (`promptselector` package)

A [`Selector`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptselector/Selector.java) orchestrates the evaluation of multiple prompts within a given evaluation budget.
It determines which prompts to test and when, managing the trade-off between exploration (testing new prompts) and exploitation (focusing on promising prompts).
Selectors use the `selectAndEvaluate` method to coordinate prompt evaluation, calling the metric to score prompts against classification examples while respecting budget constraints.

The exact evaluation budget parameters are selector-specific, controlling how many total evaluations can be performed.
This budget management is crucial for expensive LLM-based evaluations.

Custom selectors can be added by implementing the [`Selector`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptselector/Selector.java) interface.

#### Available Selectors

- **[`Simple Selector`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptselector/SimpleSelector.java)** (`simple`):
Evaluates all provided candidate prompts against a subset of examples.
The sample size is determined by dividing the evaluation budget by the number of prompts.
Examples are shuffled randomly before selection to ensure diverse evaluation.

- **[`Upper Confidence Bound Bandit Selector`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptselector/UpperConfidenceBoundBanditSelector.java)** (`ucb`):
Implements a multi-armed bandit approach using the UCB (Upper Confidence Bound) algorithm.
Balances exploration and exploitation by selecting prompts based on both their current performance and uncertainty.
More efficient than simple selection when evaluating many prompts, as it focuses on promising candidates.
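
The core UCB selection rule can be sketched in plain Java as follows. This illustrates the bandit idea only; it is not the actual `UpperConfidenceBoundBanditSelector` code, and the exploration weight and bookkeeping are assumptions:

```java
import java.util.Arrays;

// Sketch of UCB arm selection over candidate prompts.
// totalScore[i] is the summed metric score of prompt i, pulls[i] how often it was evaluated.
// Illustrative only; the real selector manages the evaluation budget internally.
public class UcbSelectionSketch {
    public static int selectArm(double[] totalScore, int[] pulls, double explorationWeight) {
        int totalPulls = Arrays.stream(pulls).sum();
        int best = -1;
        double bestUcb = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < pulls.length; i++) {
            if (pulls[i] == 0) {
                return i; // always evaluate untested prompts first
            }
            double mean = totalScore[i] / pulls[i];
            // Exploitation (mean score) plus an exploration bonus that shrinks with more pulls.
            double ucb = mean + explorationWeight * Math.sqrt(Math.log(totalPulls) / pulls[i]);
            if (ucb > bestUcb) {
                bestUcb = ucb;
                best = i;
            }
        }
        return best;
    }
}
```

Each evaluation round picks the arm (prompt) with the highest upper confidence bound, so rarely tested prompts keep getting chances until their uncertainty shrinks.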

### Optimizers (`promptoptimizer` package)

The [`Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/PromptOptimizer.java) module handles prompt optimization requests.
@@ -56,6 +112,21 @@ Custom optimizers can be added by implementing the [`Prompt Optimizer`](../src/m
In each iteration, it queries the model with an additional feedback text on the current prompt.
The optimizer carries the optimized prompt to the next iteration naively.
Trace links that were incorrectly classified in previous iterations are highlighted in the feedback text to guide the model towards better performance.
- **[`ProTeGi Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/ProTeGiOptimizer.java)** (`protegi`):
An advanced optimizer based on textual gradient descent for large language models, following the approach by Pryzant et al. (2023).
Uses textual gradients derived from error analysis to systematically refine prompts.
In each iteration:
1. **Candidate Expansion**: Generates multiple candidate prompt variations
- Analyzes why the current prompt misclassifies examples (textual gradients)
- Creates transformations based on these error patterns
- Generates synonym variations to explore the prompt space
2. **Candidate Evaluation**: Uses the configured selector and metric to evaluate all candidate prompts
- Selector decides which candidate prompts to test and on how many examples (budget-aware)
- Metric scores each candidate prompt's performance
3. **Best Selection**: Selects the top-performing candidate prompts (beam size) for the next iteration

Example flow: the current prompt reaches 70% accuracy → the optimizer generates 20 candidates → evaluates them within the limited budget → selects the top 4 for the next iteration.

- **[`Mock Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/MockOptimizer.java)** (`mock`): Returns dummy optimized prompts for testing purposes
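
The expand-evaluate-select loop described for ProTeGi can be sketched as a single beam-search step. The functional parameters stand in for the LLM-backed gradient/paraphrase generation and the selector/metric evaluation; all names here are illustrative, not the actual `ProTeGiOptimizer` API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

// Sketch of one ProTeGi-style beam iteration: expand each beam prompt into candidates,
// score all candidates, keep the best beamSize prompts for the next iteration.
// expand stands in for textual-gradient + paraphrase generation; score for selector + metric.
public class ProTeGiIterationSketch {
    public static List<String> iterate(
            List<String> beam,
            Function<String, List<String>> expand,
            ToDoubleFunction<String> score,
            int beamSize) {
        List<String> candidates = new ArrayList<>(beam);
        for (String prompt : beam) {
            candidates.addAll(expand.apply(prompt));
        }
        // Best-scoring candidates first; keep only the beam for the next iteration.
        candidates.sort(Comparator.comparingDouble(score).reversed());
        return candidates.subList(0, Math.min(beamSize, candidates.size()));
    }
}
```

Running this step repeatedly, with the beam from one iteration feeding the next, reproduces the textual-gradient-descent loop in miniature.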

## Configuration
82 changes: 82 additions & 0 deletions example-configs/gradient-optimizer-config.json
@@ -0,0 +1,82 @@

{
"cache_dir": "./cache/WARC",

"gold_standard_configuration": {
"path": "./datasets/req2req/WARC/answer.csv",
"hasHeader": "true"
},

"source_artifact_provider" : {
"name" : "text",
"args" : {
"artifact_type" : "requirement",
"path" : "./datasets/req2req/WARC/high"
}
},
"target_artifact_provider" : {
"name" : "text",
"args" : {
"artifact_type" : "requirement",
"path" : "./datasets/req2req/WARC/low"
}
},
"source_preprocessor" : {
"name" : "artifact",
"args" : {}
},
"target_preprocessor" : {
"name" : "artifact",
"args" : {}
},
"embedding_creator" : {
"name" : "openai",
"args" : {
"model": "text-embedding-3-large"
}
},
"source_store" : {
"name" : "custom",
"args" : {}
},
"target_store" : {
"name" : "cosine_similarity",
"args" : {
"max_results" : "4"
}
},
"metric" : {
"name" : "pointwise",
"args" : {}
},
"selector" : {
"name" : "ucb",
"args" : {
"samples_per_eval" : "16"
}
},
"prompt_optimizer": {
"name" : "gradient_openai",
"args" : {
"prompt": "Question: Here are two parts of software development artifacts.\n\n {source_type}: '''{source_content}'''\n\n {target_type}: '''{target_content}'''\n Are they related?\n\n Answer with 'yes' or 'no'.",
"model": "gpt-4o-mini-2024-07-18",
"maximum_iterations": 3,
"minibatch_size" : "20"
}
Review comment on lines +63 to +65 (Member):

> Numeric values are inconsistently quoted across this config file: maximum_iterations uses a bare integer while minibatch_size uses a quoted string. Both are read via argumentAsInt which coerces strings, so it works, but the inconsistency is confusing. Please standardise — either always use bare JSON numbers or always quote them.

},
"classifier" : {
"name" : "simple_openai",
"args" : {
"model": "gpt-4o-mini-2024-07-18",
"temperature": 0.0
}
},
"result_aggregator" : {
"name" : "any_connection",
"args" : {}
},
"tracelinkid_postprocessor" : {
"name" : "identity",
"args" : {}
}
}
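
The review comment above notes that both bare and quoted numbers work because the config reader coerces strings to integers. A minimal sketch of such lenient parsing (illustrative, not the project's actual `argumentAsInt` implementation):

```java
// Lenient integer parsing that accepts both bare JSON numbers and quoted strings,
// mirroring the coercion behavior the review comment describes.
// Illustrative only; the project's actual argumentAsInt may differ.
public class LenientArgParsing {
    public static int argumentAsInt(Object value) {
        if (value instanceof Number number) {
            return number.intValue(); // bare JSON number, e.g. 3
        }
        return Integer.parseInt(value.toString().trim()); // quoted string, e.g. "20"
    }
}
```

Coercion like this keeps configs working either way, but standardizing on one quoting style, as the reviewer asks, avoids confusion.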
@@ -77,13 +77,16 @@ public void run() {
         }
 
         for (Path optimizationConfig : configsToOptimize) {
-            String optimizedPrompt = runOptimization(optimizationConfig);
-            if (optimizedPrompt.isEmpty()) {
+            List<String> optimizedPrompts = runOptimizations(optimizationConfig);
+            if (optimizedPrompts.isEmpty()) {
                 logger.warn(
-                        "Skipping evaluation for optimization config '{}' as no optimized prompt was generated.",
+                        "Skipping evaluation for optimization config '{}' as no optimized prompt was generated. "
+                                + "This can happen when the optimizer terminates early (e.g., due to configuration such "
+                                + "as zero iterations) or when a mock optimizer is used.",
                         optimizationConfig);
                 continue;
             }
+            String optimizedPrompt = optimizedPrompts.getLast();
             for (Path evaluationConfig : configsToEvaluate) {
                 runEvaluation(evaluationConfig, optimizedPrompt);
             }
@@ -94,21 +97,21 @@
      * Runs the optimization pipeline using the specified configuration file.
      *
      * @param optimizationConfig The path to the optimization configuration file
-     * @return The optimized prompt generated by the optimization pipeline
+     * @return The optimized prompts generated by the optimization pipeline
      */
-    private static String runOptimization(Path optimizationConfig) {
+    private static List<String> runOptimizations(Path optimizationConfig) {
         logger.info("Invoking the optimization pipeline with '{}'", optimizationConfig);
-        String optimizedPrompt = "";
+        List<String> optimizedPrompts = List.of();
         try {
             var optimization = new Optimization(optimizationConfig);
-            optimizedPrompt = optimization.run();
+            optimizedPrompts = optimization.run();
         } catch (IOException e) {
             logger.warn(
                     "Optimization configuration '{}' threw an exception: {} \n Maybe the file does not exist?",
                     optimizationConfig,
                     e.getMessage());
         }
-        return optimizedPrompt;
+        return optimizedPrompts;
     }

private static void runEvaluation(Path evaluationConfig, String optimizedPrompt) {
33 changes: 24 additions & 9 deletions src/main/java/edu/kit/kastel/sdq/lissa/ratlr/Optimization.java
@@ -5,6 +5,7 @@
 
 import java.io.IOException;
 import java.nio.file.Path;
+import java.util.List;
 import java.util.Objects;
 import java.util.Set;
 
@@ -18,6 +19,7 @@
 import edu.kit.kastel.sdq.lissa.ratlr.knowledge.TraceLink;
 import edu.kit.kastel.sdq.lissa.ratlr.promptoptimizer.PromptOptimizer;
 import edu.kit.kastel.sdq.lissa.ratlr.promptoptimizer.promptmetric.Metric;
+import edu.kit.kastel.sdq.lissa.ratlr.promptoptimizer.promptselector.Selector;
 
 /**
  * Represents a single prompt optimization run of the LiSSA framework.
@@ -68,7 +70,7 @@ public Optimization(Path configFile) throws IOException {
      * <ol>
      * <li>Loads the configuration from the specified file</li>
      * <li>Initializes the evaluation pipeline</li>
-     * <li>Creates the Metric, Evaluator and Optimizer</li>
+     * <li>Creates the Metric, Selector and Optimizer</li>
      * </ol>
      *
      * @throws IOException If there are issues reading the configuration
@@ -84,8 +86,13 @@ private void setup() throws IOException {
                 evaluationPipeline.getClassifier(),
                 evaluationPipeline.getAggregator(),
                 evaluationPipeline.getTraceLinkIdPostProcessor());
+        Selector selector = null;
+        if (configuration.selector() != null) {
+            selector = Selector.createSelector(configuration.selector());
+        }
 
-        promptOptimizer = PromptOptimizer.createOptimizer(configuration.promptOptimizer(), goldStandard, metric);
+        promptOptimizer =
+                PromptOptimizer.createOptimizer(configuration.promptOptimizer(), goldStandard, metric, selector);
         configuration.serializeAndDestroyConfiguration();
     }
 
@@ -95,24 +102,32 @@
      * <ol>
      * <li>Sets up the source and target stores</li>
      * <li>Optimizes the prompt using the configured optimizer</li>
-     * <li>Generates and saves optimization statistics</li>
+     * <li>Generates and saves optimization statistics for the final prompt</li>
      * <li>Flushes the cache to persist changes</li>
      * </ol>
      *
-     * @return The optimized prompt as a String
+     * @return A list of prompts representing the optimization state at each iteration,
+     *         where the last element is the final optimized prompt
      */
-    public String run() {
+    public List<String> run() {
         evaluationPipeline.initializeSourceAndTargetStores();
 
         logger.info("Optimizing Prompt");
-        String result =
-                promptOptimizer.optimize(evaluationPipeline.getSourceStore(), evaluationPipeline.getTargetStore());
-        logger.info("Optimized Prompt: {}", result);
-
-        Statistics.generateOptimizationStatistics(configFile.toFile(), configuration, result);
+        List<String> results =
+                promptOptimizer.optimize(evaluationPipeline.getSourceStore(), evaluationPipeline.getTargetStore());
+
+        if (results.isEmpty()) {
+            logger.warn("No optimized prompt was generated. Make sure maximum_iterations is set to greater than zero.");
+            return results;
+        }
+
+        Statistics.generateOptimizationStatistics(configFile.toFile(), configuration, results.getLast());
+
+        logger.info("Optimized prompt after {} steps: \n {}", results.size(), results.getLast());
 
         CacheManager.getDefaultInstance().flush();
 
-        return result;
+        return results;
     }
 }