feat: switch solver to index-based representation with optimized first-guess strategy by FlysonBot · Pull Request #56 · FlysonBot/Mastermind

FlysonBot · 2026-03-05T04:21:29Z

OVERVIEW

Refactors the core Mastermind solver to operate on base‑c code indices instead of decimal code integers, updating feedback, solution space, sampling, and tests accordingly.
-Introduces an optimized first-guess strategy with precomputed best opening moves per (c, d), plus calibration tooling and benchmarks for validating performance and accuracy.
-Reworks canonical form counting and enumeration around integer partitions, simplifying their use for symmetry reduction and aligning tests with the new index representation.
Adds incremental feedback and expected-rank computations for sequential secret scanning, improving performance of the first filter and full-space evaluations.
Simplifies code generation and sampling utilities to work directly on index ranges and BitSet-backed solution spaces, including new strategies for sampling valid guesses/secrets under size constraints.

Summary by Sourcery

Switch the solver to operate on base-c code indices instead of decimal code integers, introducing incremental feedback and an optimized first-guess strategy, and updating supporting utilities, benchmarks, and tests accordingly.

New Features:

Add BestFirstGuess to provide precomputed optimal opening guesses per (c, d) with calibration tooling and true-rank computation.
Introduce FeedbackIncremental for efficient sequential feedback computation over the full secret space.
Add GuessStrategy to adaptively select guess/secret subsets based on search-space size thresholds.
Provide ConvertCode utilities for converting between human-readable codes and index-based representation.

Enhancements:

Refactor SolutionSpace, Feedback, BestGuess, ExpectedSize, and related tests to use index-based representations and BitSet-backed spaces sized to c^d.
Optimize the first filter and full-space expected-rank calculations using incremental feedback over contiguous index ranges.
Rework CanonicalCode counting and enumeration around integer partitions of d with at most c parts and adjust tests to match the index-based canonical representatives.
Simplify AllValidCode and SampledCode to generate and sample over index ranges, including efficient valid-index sampling with enumeration vs rejection paths.
Update Demo and MastermindSession to use index-based guesses/secrets, integrate the new first-guess strategy, and improve logging of expected vs actual eliminations.
Refine JMH benchmarks for SolutionSpace, ExpectedSize, BestGuess, Feedback, and sampling to reflect the new index representation and incremental algorithms.
Update CLAUDE.md to reflect the solver’s current performance and project focus.

Documentation:

Adjust CanonicalCode tests and comments to describe integer-partition-based canonical forms and index-based representations.

Tests:

Update and expand tests for feedback correctness, expected size symmetry, solution space filtering, best-guess selection, sampling behavior, canonical form enumeration, and full-game sessions under the new index-based model.

Chores:

Remove unused workflows and legacy classes such as CodeCache and the old GuessStrategy and benchmark implementations tied to decimal code representations.

… ConvertCode utils for index and code conversion

…Strategy

- Use ConvertCode.toCode(i) to avoid allValid[i] query - Use Math.pow(c, d) instead of allValid.length - Reduce allValid usage unless necessary - Use straight call to CanonicalCode without going through CodeCache - Removed cache for CanonicalCode in CodeCache

- getFeedback now take index and extract digit directly, so its caller doesn't have to waste time generate code before getFeedback can be called - This change propagated through all methods

…ymmetry Now CanonicalCode can cut down the guess space for 9x9 Mastermind from 9^9 to 30.

…sses through either brute-force or sampling

- BestFirstGuess.of() now returns long[]{code, rank} instead of int - Hardcoded table updated with precomputed true ranks for all game sizes - MastermindSession uses BestFirstGuess.of() directly for the first move - GuessStrategy.select() drops the `turn` parameter (first-move logic moved out) - Rename laterTurns → selectSearchSpace for clarity

- Add FeedbackIncremental class with incremental feedback computation - Add calcExpectedRankFirst in ExpectedSize using FeedbackIncremental - Add filterRangeFirst in SolutionSpace using FeedbackIncremental for first filter - Update BestFirstGuess to use calcExpectedRankFirst

- Explain counter algorithm for white-peg computation in Feedback - Clarify GuessStrategy tolerance/percentile sampling heuristics - Add minor inline comments in various places - Update CLAUDE.md status and current focus section

sourcery-ai

Hey - I've found 9 issues, and left some high level feedback:

There are multiple places where Math.pow(c, d) is recomputed and cast to int (e.g., SolutionSpace, GuessStrategy, SampledCode, ExpectedSizeBenchmark); consider centralizing the c^d computation (using an integer loop or a precomputed totalCodes passed around) to avoid repeated double math and make it easier to change the supported range of c/d later.
In hot paths that use incremental feedback (e.g., SolutionSpace.filterRangeFirst and similar benchmark code), you allocate temporary arrays like int[] result = new int[3] inside tight loops; it would be more efficient to allocate these once per task/thread and reuse them to reduce GC pressure during large-space filtering.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- There are multiple places where `Math.pow(c, d)` is recomputed and cast to `int` (e.g., `SolutionSpace`, `GuessStrategy`, `SampledCode`, `ExpectedSizeBenchmark`); consider centralizing the `c^d` computation (using an integer loop or a precomputed `totalCodes` passed around) to avoid repeated double math and make it easier to change the supported range of `c`/`d` later.
- In hot paths that use incremental feedback (e.g., `SolutionSpace.filterRangeFirst` and similar benchmark code), you allocate temporary arrays like `int[] result = new int[3]` inside tight loops; it would be more efficient to allocate these once per task/thread and reuse them to reduce GC pressure during large-space filtering.

## Individual Comments

### Comment 1
<location path="src/main/java/org/mastermind/codes/SampledCode.java" line_range="45-30" />
<code_context>
+     */
+    static final int MAX_ENUM = 5_000_000;
+
+    public static int[] getValidSample(BitSet remaining, int validCount, int c, int d, int sampleSize) {
+        int    total  = (int) Math.pow(c, d);
+        int[]  sample = new int[sampleSize];
+        Random random = new Random();
+
+        if (validCount <= MAX_ENUM) {
+            // Enumeration: bounded memory (≤20MB), fast scan, fast random access.
+            int[] valid = new int[validCount];
+            int   j     = 0;
+            for (int i = remaining.nextSetBit(0); i >= 0; i = remaining.nextSetBit(i + 1)) {
+                valid[j++] = i;
+            }
+            for (int i = 0; i < sampleSize; i++) {
+                sample[i] = valid[random.nextInt(validCount)];
+            }
</code_context>
<issue_to_address>
**suggestion (performance):** Repeatedly creating Random instances in getValidSample can be avoided for better throughput.

Both branches of `getValidSample` create a new `Random` on each call, which adds avoidable allocation and RNG contention for sampling-heavy workloads. Prefer `ThreadLocalRandom.current()` or a shared per-thread `Random`/`SplittableRandom` to reduce overhead while preserving behavior. The same consideration applies to `getSample` above.
</issue_to_address>

### Comment 2
<location path="src/tests/java/org/mastermind/codes/SampledCodeTest.java" line_range="27-26" />
<code_context>
+            assertEquals(30, CanonicalCode.countCanonicalForms(9, 9));
         }

         @ParameterizedTest
-        @DisplayName("Small known values for Stirling sum")
+        @DisplayName("Small known values (integer partitions of d with at most c parts)")
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for getValidSample to cover both enumeration and rejection sampling paths

The new `SampledCode.getValidSample(BitSet remaining, int validCount, int c, int d, int sampleSize)` method is core to the sampling logic but currently untested. Please add focused unit tests that:

- Use small, easily understood `BitSet` instances (e.g., `total = c^d <= 100`) and verify that returned indices are in range and correspond only to set bits.
- Exercise both code paths by choosing cases where `validCount <= MAX_ENUM` (enumeration) and `validCount > MAX_ENUM` (rejection), and confirm they still only yield valid indices.
- Cover the edge case where `sampleSize > validCount` and assert or document the intended behavior (e.g., duplicates vs. error).

These tests will help ensure both sampling paths remain correct and guard against regressions if `MAX_ENUM` or the sampling strategy changes.
</issue_to_address>

### Comment 3
<location path="src/tests/java/org/mastermind/solver/ExpectedSizeTest.java" line_range="12" />
<code_context>

     private static final int          COLORS          = 6;
     private static final int          DIGITS          = 4;
+    private static final int          TOTAL           = 1296; // 6^4
     private static final float        DELTA           = 0.001f;
     private static final int[]        feedbackFreq    = new int[100];
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test that validates calcExpectedRankFirst against calcExpectedRank over the full secret space

The new `ExpectedSize.calcExpectedRankFirst` path isn’t currently covered by tests. Please add a test that, for several guesses (e.g., `1122`, `1123`, `1234`), computes:

- `rankFull = calcExpectedRank(guessInd, secretsInd /* 0..TOTAL-1 */, COLORS, DIGITS, feedbackFreq)`
- `rankInc = calcExpectedRankFirst(guessInd, COLORS, DIGITS, TOTAL, feedbackFreq)`

and asserts `rankFull == rankInc` for each guess to validate the incremental implementation matches the original behavior.

Suggested implementation:

```java
import org.junit.jupiter.api.Test;
import org.mastermind.codes.ConvertCode;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.*;

```

```java
    private static final int          COLORS          = 6;
    private static final int          DIGITS          = 4;
    private static final int          TOTAL           = 1296; // 6^4
    private static final float        DELTA           = 0.001f;
    private static final int[]        feedbackFreq    = new int[100];
    private static final ExpectedSize expectedSizeObj = new ExpectedSize(DIGITS);

    // All secret indices 0..TOTAL-1
    private static final int[] secretsInd;

    static {
        secretsInd = new int[TOTAL];
        for (int i = 0; i < TOTAL; i++) {
            secretsInd[i] = i;
        }
    }

    /**
     * Converts a code string like "1122" into the corresponding internal index in [0, TOTAL).
     * Assumes digits are in '1'..'0' + COLORS and uses base-COLORS positional encoding.
     */
    private static int codeToIndex(String code) {
        if (code.length() != DIGITS) {
            throw new IllegalArgumentException("Code length must be " + DIGITS + ": " + code);
        }

        int index = 0;
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            if (c < '1' || c >= '1' + COLORS) {
                throw new IllegalArgumentException("Digit out of range for COLORS=" + COLORS + ": " + c);
            }
            int digit = c - '1'; // map '1'.. to 0..
            index = index * COLORS + digit;
        }
        return index;
    }

    @Test
    void calcExpectedRankFirst_matches_full_expected_rank_for_sample_guesses() {
        int[] guesses = new int[] {
                codeToIndex("1122"),
                codeToIndex("1123"),
                codeToIndex("1234")
        };

        for (int guessInd : guesses) {
            // Compute expected rank over the full secret space
            Arrays.fill(feedbackFreq, 0);
            float rankFull = expectedSizeObj.calcExpectedRank(
                    guessInd,
                    secretsInd,
                    COLORS,
                    DIGITS,
                    feedbackFreq
            );

            // Compute expected rank using the incremental implementation
            Arrays.fill(feedbackFreq, 0);
            float rankInc = expectedSizeObj.calcExpectedRankFirst(
                    guessInd,
                    COLORS,
                    DIGITS,
                    TOTAL,
                    feedbackFreq
            );

            assertEquals(rankFull, rankInc, DELTA,
                    "calcExpectedRankFirst should match calcExpectedRank for guess index " + guessInd);
        }
    }

```

This implementation assumes:

1. `ExpectedSize.calcExpectedRank` and `ExpectedSize.calcExpectedRankFirst` are instance methods on `expectedSizeObj` with the signatures:
   - `float calcExpectedRank(int guessInd, int[] secretsInd, int colors, int digits, int[] feedbackFreq)`
   - `float calcExpectedRankFirst(int guessInd, int colors, int digits, int total, int[] feedbackFreq)`

2. Indices for codes are generated via a base-`COLORS` encoding with digits `'1'..('1'+COLORS-1)` mapped to `0..COLORS-1`.  
   If your existing `ConvertCode` utility already provides the canonical mapping between code strings and indices, you may want to:
   - Replace `codeToIndex` with a call into `ConvertCode` to avoid duplicating logic, and
   - Remove or adapt the range checks in `codeToIndex` accordingly.

3. If `calcExpectedRankFirst` or `calcExpectedRank` are `static` instead of instance methods, update the calls in the new test to call them statically (e.g., `ExpectedSize.calcExpectedRank(...)`) and remove `expectedSizeObj` if it becomes unused.

Ensure the test class declaration (`public class ExpectedSizeTest { ... }`) encloses the new members and that your existing formatting and naming conventions are preserved.
</issue_to_address>

### Comment 4
<location path="src/tests/java/org/mastermind/solver/SolutionSpaceTest.java" line_range="16-17" />
<code_context>
      * Simulate a full game with secret 1234 (a typical canonical starting case).
      * The solver must finish within MAX_TURNS turns.
      */
     @Test
     void testSolveSecret1234() {
-        runGame(1234);
</code_context>
<issue_to_address>
**suggestion (testing):** Extend SolutionSpace tests to cover multiple filters and the specialized first-filter path

The new `SolutionSpace` uses an `isFirstFilter` flag and a dedicated `filterRangeFirst` path only on the first call, then falls back to `filterRange`. The current test exercises only the first call.

To fully cover this behavior, add a test that:
1. Applies an initial filter and records the resulting secrets/size.
2. Applies a second filter with a different guess/feedback.
3. Builds a reference BitSet by iterating 0..TOTAL-1, applying `Feedback.getFeedback` for both guesses, and comparing:
   - `space.getSize()` to the number of bits set in the reference.
   - `space.getSecrets()` to the indices set in the reference.

This verifies both the correctness of the first incremental filter and that subsequent filters use the standard path without corrupting the BitSet or misusing the flag.
</issue_to_address>

### Comment 5
<location path="src/tests/java/org/mastermind/MastermindSessionTest.java" line_range="22-23" />
<code_context>
      * Simulate a full game with secret 1234 (a typical canonical starting case).
      * The solver must finish within MAX_TURNS turns.
      */
     @Test
     void testSolveSecret1234() {
-        runGame(1234);
+        runGame(ind(1234));
</code_context>
<issue_to_address>
**suggestion (testing):** Add coverage for BestFirstGuess / GuessStrategy via suggestGuessWithDetails on the first turn

These tests don’t exercise the new first-guess and selection logic that the game flow now depends on.

1. Add a test that calls `session.suggestGuessWithDetails()` on an empty history (e.g., c=6, d=4) and asserts that:
   - The guess index matches the expected best opening (e.g., `1123` via `ConvertCode.toCode`).
   - The reported rank matches the precomputed `BestFirstGuess` value.
   - The reported secrets length equals the full solution-space size (c^d).

2. Add a unit test for `GuessStrategy.select` with a larger (c, d) where `guesses × secrets` exceeds the threshold, asserting that:
   - It returns sampled secrets/guesses with sizes matching the sampling helpers.
   - All returned indices are valid within the current `SolutionSpace`.

This will ensure the first-guess table and adaptive selection are properly covered by tests.

Suggested implementation:

```java
    /** Simulate a full game with secret 6666 (all same color, worst-case candidate). */
    @Test
    void testSolveSecret6666() {
        runGame(ind(6666));
    }

    /**
     * On an empty history for (C=6, D=4), the first suggested guess should:
     *  - match the precomputed BestFirstGuess opening index
     *  - have the expected best-first rank
     *  - see the full solution space (C^D) as possible secrets
     */
    @Test
    void testSuggestGuessWithDetailsOnEmptyHistory() {
        final int c = 6;
        final int d = 4;

        // Full solution space size for (c, d)
        final int expectedSecretsSize = (int) Math.pow(c, d);

        // Build a fresh session with no history
        MastermindSession session = new MastermindSession(c, d);

        // Ask the session for the first guess with details
        SuggestionDetails details = session.suggestGuessWithDetails();

        // Best-first-guess table should contain the canonical opening for (c, d)
        BestFirstGuess.Entry bestFirst = BestFirstGuess.forParameters(c, d);

        // The suggested guess index must match the table's best opening
        assertEquals(bestFirst.index(), details.guessIndex(),
                "First suggested guess index should match BestFirstGuess table");

        // Rank reported by the session should match the precomputed rank
        assertEquals(bestFirst.rank(), details.rank(),
                "Reported rank should match BestFirstGuess precomputed rank");

        // The number of candidate secrets on the first move should be the whole space
        assertEquals(expectedSecretsSize, details.secretsCount(),
                "First move secrets count should equal full solution-space size");

        // Optionally ensure the decoded code matches the known best opening (e.g. 1123)
        int[] guessCode = ConvertCode.toCode(c, d, details.guessIndex());
        int[] expectedCode = ConvertCode.toCode(c, d, bestFirst.index());
        assertArrayEquals(expectedCode, guessCode,
                "First guess code should equal the canonical best opening for (6,4)");
    }

    /**
     * For larger (C, D) where guesses×secrets exceeds the sampling threshold,
     * GuessStrategy.select must return sampled secrets/guesses, with sizes
     * matching the sampling helpers and all indices valid within the SolutionSpace.
     */
    @Test
    void testGuessStrategySelectSampling() {
        // Choose parameters large enough that guesses×secrets will exceed sampling threshold.
        // Adjust these if the threshold changes in GuessStrategy.
        final int c = 8;
        final int d = 5;

        SolutionSpace space = new SolutionSpace(c, d);

        int totalSecrets = space.size();

        // Use all possible codes as candidate guesses to guarantee a large product.
        int[] allGuessIndices = new int[totalSecrets];
        for (int i = 0; i < totalSecrets; i++) {
            allGuessIndices[i] = i;
        }

        // Compute expected sample sizes using the same helpers that GuessStrategy uses.
        int expectedSampledSecrets = GuessStrategy.sampleSecretsCount(totalSecrets, allGuessIndices.length);
        int expectedSampledGuesses = GuessStrategy.sampleGuessesCount(totalSecrets, allGuessIndices.length);

        GuessSelection selection = GuessStrategy.select(space, allGuessIndices);

        int[] sampledSecrets = selection.secretIndices();
        int[] sampledGuesses = selection.guessIndices();

        // Sizes should match the sampling helper calculations
        assertEquals(expectedSampledSecrets, sampledSecrets.length,
                "Sampled secrets size should match GuessStrategy.sampleSecretsCount");
        assertEquals(expectedSampledGuesses, sampledGuesses.length,
                "Sampled guesses size should match GuessStrategy.sampleGuessesCount");

        // All sampled secrets must be valid indices into the current SolutionSpace
        for (int idx : sampledSecrets) {
            assertTrue(idx >= 0 && idx < totalSecrets,
                    "Sampled secret index must be within solution-space bounds");
        }

        // All sampled guesses must be valid indices into the current SolutionSpace
        for (int idx : sampledGuesses) {
            assertTrue(idx >= 0 && idx < totalSecrets,
                    "Sampled guess index must be within solution-space bounds");
        }
    }

    /** Simulate a full game with secret 1562 (arbitrary mid-range code). */

```

These tests assume the following existing APIs/types; please align them with your actual code:

1. `MastermindSession`:
   - Has a constructor `MastermindSession(int c, int d)` that creates a fresh session with an empty history.
   - Exposes `SuggestionDetails suggestGuessWithDetails()` where:
     - `int guessIndex()` returns the index of the suggested guess in the global code index space for `(c, d)`.
     - `int rank()` returns the precomputed rank for that guess.
     - `int secretsCount()` (or similar) returns the number of remaining candidate secrets.  
   If `SuggestionDetails` uses different accessor names or is not a record, update the field access accordingly.

2. `BestFirstGuess`:
   - Exposes something like `static BestFirstGuess.Entry forParameters(int c, int d)` returning an entry with:
     - `int index()` for the best opening guess index.
     - `int rank()` for the best rank.
   If your API is different (e.g., `BestFirstGuess.of(c, d)`, fields instead of accessors, or a plain class), adjust accordingly.
   - The comment assumes the canonical best opening for `(6,4)` is the one stored in `BestFirstGuess` (often corresponding to `1123`); if you expose this differently, update the test to assert that concrete code instead (e.g. via `ConvertCode.toCode` and a hard-coded `{1,1,2,3}` expectation).

3. `ConvertCode`:
   - Has a method `static int[] toCode(int c, int d, int index)` that returns an array representing the code digits.
   If your method instead returns an `int` like `1123`, change the test to compare integers instead of arrays.

4. `SolutionSpace`:
   - Has a constructor `SolutionSpace(int c, int d)` that creates the full space, and a `size()` method returning the total number of codes.
   If you use a factory (e.g. `SolutionSpace.full(c, d)`), adjust construction accordingly.

5. `GuessStrategy` and `GuessSelection`:
   - `GuessStrategy.select(SolutionSpace space, int[] candidateGuessIndices)` returns a `GuessSelection` with:
     - `int[] secretIndices()` and `int[] guessIndices()` (or similar) for the sampled indices.
   - `GuessStrategy.sampleSecretsCount(int secrets, int guesses)` and `GuessStrategy.sampleGuessesCount(int secrets, int guesses)` are representative helper methods to compute sample sizes; if your helpers differ (e.g., different signatures or names), replace these calls with the appropriate ones or replicate the sampling size logic directly in the test.

6. Imports:
   - Ensure the test file imports the newly referenced types/methods:
     - `org.mastermind.MastermindSession`
     - `org.mastermind.BestFirstGuess`
     - `org.mastermind.SolutionSpace`
     - `org.mastermind.GuessStrategy`
     - `org.mastermind.GuessSelection`
     - `org.mastermind.ConvertCode`
   - Make sure `Math` and JUnit assertions (`assertEquals`, `assertArrayEquals`, `assertTrue`) are already imported; if not, add or adjust the relevant imports.

7. Sampling threshold:
   - The `(c=8, d=5)` configuration is chosen to likely exceed the `guesses × secrets` sampling threshold. If your threshold is significantly higher, you may need to adjust `(c, d)` or the size of the candidate guess set to guarantee that sampling is actually used (e.g., larger `c` or `d`, or restrict `allGuessIndices` to a specific large subset).
</issue_to_address>

### Comment 6
<location path="src/main/java/org/mastermind/solver/SolutionSpace.java" line_range="64" />
<code_context>
      * @param obtainedFeedback feedback value (black * 9 + d - colorFreqTotal/2)
      */
-    public void filterSolution(int guess, int obtainedFeedback) {
+    public void filterSolution(int guessInd, int obtainedFeedback) {
+        // Read and update flag
+        final boolean isFirst = isFirstFilter;
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the first-filter optimization by using a single range-filter strategy per call and encapsulating the incremental state into a dedicated helper class to reduce branching and make the special path clearer.

You can keep the first-filter optimization while reducing the extra execution mode complexity mostly by:

1. **Collapsing the `isFirst` branching into a single “strategy” variable per call**, instead of repeating ternaries in both sequential and parallel paths.
2. **Encapsulating the incremental-state plumbing in `filterRangeFirst`** so its preconditions and state are clearer and less error‑prone.

### 1. Replace boolean branching with a range-filter strategy

Right now `filterSolution` has duplicated ternary logic:

```java
size -= isFirst ?
        filterRangeFirst(guessInd, obtainedFeedback, 0, totalCodes) :
        filterRange(guessInd, obtainedFeedback, 0, totalCodes);

// ...

futures[taskCount++] = isFirst ?
        POOL.submit(() -> filterRangeFirst(guessInd, obtainedFeedback, from, to)) :
        POOL.submit(() -> filterRange(guessInd, obtainedFeedback, from, to));

// ...

int removed = isFirst ?
        filterRangeFirst(guessInd, obtainedFeedback, fromIndex, totalCodes) :
        filterRange(guessInd, obtainedFeedback, fromIndex, totalCodes);
```

Introduce a tiny internal functional interface and select the strategy once per call:

```java
@FunctionalInterface
private interface RangeFilter {
    int apply(int guessInd, int obtainedFeedback, int from, int to);
}
```

Then `filterSolution` can be simplified to:

```java
public void filterSolution(int guessInd, int obtainedFeedback) {
    final boolean isFirst = isFirstFilter;
    if (isFirst) isFirstFilter = false;

    final RangeFilter rangeFilter = isFirst ? this::filterRangeFirst : this::filterRange;

    if (size < PARALLEL_THRESHOLD) {
        size -= rangeFilter.apply(guessInd, obtainedFeedback, 0, totalCodes);
        return;
    }

    int parallelism  = POOL.getParallelism();
    int words        = (totalCodes + 63) >>> 6;
    int wordsPerTask = Math.max(1, (words + parallelism - 1) / parallelism);

    @SuppressWarnings("unchecked")
    Future<Integer>[] futures = new Future[parallelism];
    int fromIndex = 0;
    int taskCount = 0;

    while (fromIndex + wordsPerTask * 64 < totalCodes) {
        final int from = fromIndex;
        final int to   = fromIndex + wordsPerTask * 64;
        futures[taskCount++] = POOL.submit(
                () -> rangeFilter.apply(guessInd, obtainedFeedback, from, to)
        );
        fromIndex = to;
    }

    int removed = rangeFilter.apply(guessInd, obtainedFeedback, fromIndex, totalCodes);

    for (int i = 0; i < taskCount; i++) {
        try {
            removed += futures[i].get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    size -= removed;
}
```

This keeps the exact behavior (including parallelization and first-filter optimization) but reduces the cognitive overhead: callers only see “one filtering algorithm” per invocation, expressed as `rangeFilter`.

### 2. Encapsulate incremental state in `filterRangeFirst`

`filterRangeFirst` currently threads multiple pieces of state and an ad‑hoc `result` array:

```java
int[] guessDigits      = init.guessDigits();
int[] secretDigits     = init.secretDigits();
int[] colorFreqCounter = init.colorFreqCounter();
int   black            = init.black();
int   colorFreqTotal   = init.colorFreqTotal();
// ...
int[] result = new int[3];
for (int i = from + 1; i < to; i++) {
    FeedbackIncremental.getFeedbackIncremental(guessDigits, secretDigits, black, colorFreqCounter,
                                               colorFreqTotal, c, d, result);
    black = result[1];
    colorFreqTotal = result[2];
    // ...
}
```

You can wrap this into a tiny state holder so the invariants and mutation points are obvious and less fragile:

```java
private static final class IncrementalState {
    final int[] guessDigits;
    final int[] secretDigits;
    final int[] colorFreqCounter;
    int black;
    int colorFreqTotal;

    IncrementalState(FeedbackIncremental.State init) {
        this.guessDigits      = init.guessDigits();
        this.secretDigits     = init.secretDigits();
        this.colorFreqCounter = init.colorFreqCounter();
        this.black            = init.black();
        this.colorFreqTotal   = init.colorFreqTotal();
    }
}

private int filterRangeFirst(int guessInd, int obtainedFeedback, int from, int to) {
    IncrementalState s = new IncrementalState(
            FeedbackIncremental.setupIncremental(guessInd, from, c, d)
    );
    int feedback0 = s.black * 9 + d - (s.colorFreqTotal >>> 1);

    int removed = 0;
    if (feedback0 != obtainedFeedback) {
        remaining.clear(from);
        removed++;
    }

    int[] result = new int[3];
    for (int i = from + 1; i < to; i++) {
        FeedbackIncremental.getFeedbackIncremental(
                s.guessDigits, s.secretDigits, s.black, s.colorFreqCounter,
                s.colorFreqTotal, c, d, result
        );
        s.black          = result[1];
        s.colorFreqTotal = result[2];

        if (result[0] != obtainedFeedback) {
            remaining.clear(i);
            removed++;
        }
    }
    return removed;
}
```

This doesn’t change the logic but makes the “incremental state” explicit and easier to reason about and test in isolation if needed.

These two changes keep your first-filter optimization and parallel behavior intact while reducing branching and making the specialized path more maintainable.
</issue_to_address>

### Comment 7
<location path="src/main/java/org/mastermind/codes/CanonicalCode.java" line_range="60" />
<code_context>
-        // Base case: Code is complete
-        if (pos == d) {
-            results[index[0]++] = currentNum;
+    private static void generatePartitions(
+            int[] results, int[] index, int[] parts, int depth, int remaining, int maxVal, int maxParts, int[] place
+    ) {
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the partition generation to use a frequency array per color and a separate index-mapping function to make the canonical-form invariants explicit and easier to follow.

You can keep the partition-based approach but make the invariants explicit and reduce the “mental indirection” by:

1. Representing partitions as a frequency array (one slot per color) instead of a variable-length `parts` list.
2. Separating “generate all frequency arrays” from “map a frequency array to canonical index”.

This keeps your integer-partition logic but makes the code much closer to the “bucket of color counts” domain model and easier to reason about.

### 1. Generate frequency arrays directly

Instead of passing `parts` as a list of part sizes with `depth/remaining/maxVal`, generate an array `freq[color]` where `sum(freq) == d`, `freq[0] >= freq[1] >= ...`, and you stop at `c` colors.

```java
public static int[] enumerateCanonicalForms(int c, int d) {
    int[] results = new int[countCanonicalForms(c, d)];
    int[] index   = { 0 };

    int[] place   = new int[d];
    place[d - 1] = 1;
    for (int i = d - 2; i >= 0; i--) {
        place[i] = place[i + 1] * c;
    }

    int[] freq = new int[c]; // color frequencies, non-increasing
    generateFrequencies(results, index, freq, 0, d, d, place);
    return results;
}

private static void generateFrequencies(
        int[] results, int[] index, int[] freq,
        int color, int remaining, int maxFreq, int[] place
) {
    if (remaining == 0) {
        results[index[0]++] = buildIndexFromFreq(freq, place);
        return;
    }
    if (color == freq.length) {
        return;
    }

    int limit = Math.min(maxFreq, remaining);
    for (int f = limit; f >= 1; f--) {
        freq[color] = f;
        generateFrequencies(results, index, freq, color + 1, remaining - f, f, place);
        freq[color] = 0; // optional: clarify reuse
    }
}
```

This is still integer-partition generation, but the data structure matches the conceptual “frequency of each canonical color”.

### 2. Make the index mapping purely about frequencies

`buildIndex` currently operates on the partition vector; rewriting it to take a frequency array makes the mapping invariant explicit (“color 0 has freq[0], occupies leftmost positions, etc.”):

```java
private static int buildIndexFromFreq(int[] freq, int[] place) {
    int ind = 0;
    int pos = 0;
    for (int color = 0; color < freq.length; color++) {
        for (int k = 0; k < freq[color]; k++) {
            ind += color * place[pos++];
        }
    }
    return ind;
}
```

This removes the need to reason about `parts` vs. “number of parts” and the implicit mapping between partition order and color indices. The frequency array is the canonical “bucket” representation, and the mapping from bucket to index is localized and obvious.
</issue_to_address>

### Comment 8
<location path="src/main/java/org/mastermind/solver/BestFirstGuess.java" line_range="15" />
<code_context>
+ * Use {@link #of(int, int)} at runtime. Run {@link #main(String[])} once to
+ * regenerate the hardcoded values after algorithm changes.
+ */
+public class BestFirstGuess {
+
+    // --- Calibration constants ---
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the calibration and Monte Carlo logic into a separate `BestFirstGuessCalibrator` class so that `BestFirstGuess` remains a small, runtime-only API around the precomputed table.

You can keep all functionality while reducing complexity by extracting the calibration tooling into a separate class. That way `BestFirstGuess` is a small, runtime-only API, and all Monte‑Carlo/statistics logic lives elsewhere.

### 1. Keep `BestFirstGuess` runtime-only

Keep just the `of(int c, int d)` API and the hardcoded table in `BestFirstGuess`:

```java
// BestFirstGuess.java
package org.mastermind.solver;

import org.mastermind.codes.AllValidCode;
import org.mastermind.codes.ConvertCode;

public class BestFirstGuess {

    public static long[] of(int c, int d) {
        if (c < 2 || c > 9 || d < 1 || d > 9)
            throw new IllegalArgumentException("Unsupported game size: c=" + c + ", d=" + d);

        // d=1: single digit, all guesses equivalent — always guess 1
        if (d == 1) {
            int[] allSecrets = AllValidCode.generateAllCodes(c, 1);
            long rank = new ExpectedSize(1)
                    .calcExpectedRank(ConvertCode.toIndex(c, 1, 1), allSecrets, c, 1, new int[100]);
            return new long[]{1, rank};
        }

        return switch (c) {
            // ... unchanged switch with precomputed values ...
        };
    }
}
```

### 2. Move calibration to a dedicated tool class

Extract `main`, `evaluate`, `normalCDF`, and `Result` into something like `BestFirstGuessCalibrator`:

```java
// BestFirstGuessCalibrator.java
package org.mastermind.solver;

import org.mastermind.codes.CanonicalCode;
import org.mastermind.codes.ConvertCode;
import org.mastermind.codes.SampledCode;

final class BestFirstGuessCalibrator {

    private static final int    TRIALS               = 100;
    private static final long   TARGET_EVALS         = 13_000_000L;
    private static final double CONFIDENCE_THRESHOLD = 99.0;
    private static final int    BUDGET_MULTIPLIER    = 2;

    public static void main(String[] args) {
        // ... your existing main logic, unchanged ...
        // still prints the switch snippet for BestFirstGuess.of(...)
    }

    private static Result evaluate(
            int[] canonical, int c, int d, int totalCodes,
            long budget, int[] feedbackFreq, ExpectedSize expectedSize
    ) {
        // ... existing evaluate body unchanged ...
    }

    private static double normalCDF(double z) {
        // ... existing normalCDF body unchanged ...
    }

    private record Result(
            int bestIdx, double avgScore, double confidence,
            long totalEvals, boolean fullEval
    ) { }
}
```

### 3. Optional: Make the tool clearly non‑production

If you want to avoid any confusion about its purpose:

```java
// mark as internal/offline use
/**
 * Offline calibration tool for regenerating the hardcoded table in {@link BestFirstGuess}.
 * Not used at runtime.
 */
final class BestFirstGuessCalibrator {
    // ...
}
```

This keeps all behavior, but:

- `BestFirstGuess` is now a small, focused runtime class.
- Calibration/statistical complexity is isolated and easier to modify or even move to a separate module later.
</issue_to_address>

### Comment 9
<location path="src/main/java/org/mastermind/solver/FeedbackIncremental.java" line_range="97" />
<code_context>
+            secretDigits[p] = newDigit;
+
+            // Update colorFreqCounter and colorFreqTotal
+            int v;
+            if (gDigit == oldDigit) {
+                // Was black, now not: black--, apply newDigit contribution
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the repeated frequency/absolute-sum updates into a helper and optionally adding a wrapper class to encapsulate the incremental state to make the logic easier to reason about and use safely.

You can reduce the subtlety of the incremental update logic by factoring out the repeated “update counter + absolute sum” pattern and slightly tightening the branching. This keeps the optimization but makes the invariants easier to reason about and change.

### 1. Extract a small helper for colorFreqCounter / colorFreqTotal

Right now each branch reimplements:

```java
v = colorFreqCounter[idx];
colorFreqTotal += (v >= 0 ? 1 : -1);
colorFreqCounter[idx] = v + 1;
// and symmetric for v - 1, using (v <= 0 ? 1 : -1)
```

This duplication is where the subtlety lives. You can encapsulate the invariant “colorFreqTotal is sum of |colorFreqCounter[i]|” in a single helper:

```java
private static int applyFreqDelta(int[] colorFreqCounter, int idx, int delta, int colorFreqTotal) {
    int v = colorFreqCounter[idx];
    // removing old |v|
    colorFreqTotal -= (v >= 0 ? v : -v);
    v += delta;
    colorFreqCounter[idx] = v;
    // adding new |v|
    colorFreqTotal += (v >= 0 ? v : -v);
    return colorFreqTotal;
}
```

Then the branches in `getFeedbackIncremental` become shorter and less error-prone:

```java
if (gDigit == oldDigit) {
    // Was black, now not: black--, apply newDigit contribution
    black--;
    colorFreqTotal = applyFreqDelta(colorFreqCounter, gDigit, +1, colorFreqTotal);
    colorFreqTotal = applyFreqDelta(colorFreqCounter, newDigit, -1, colorFreqTotal);
} else if (gDigit == newDigit) {
    // Was not black, now black: black++, undo oldDigit contribution
    black++;
    colorFreqTotal = applyFreqDelta(colorFreqCounter, gDigit, -1, colorFreqTotal);
    colorFreqTotal = applyFreqDelta(colorFreqCounter, oldDigit, +1, colorFreqTotal);
} else {
    // Neither black: gDigit updates cancel, only oldDigit and newDigit change
    colorFreqTotal = applyFreqDelta(colorFreqCounter, oldDigit, +1, colorFreqTotal);
    colorFreqTotal = applyFreqDelta(colorFreqCounter, newDigit, -1, colorFreqTotal);
}
```

This keeps the same behavior but centralizes the absolute‑sum invariant in one place. Any future fix or tweak to how `colorFreqTotal` is computed is now localized.

### 2. Optional: expose a higher-level wrapper around the low-level API

To make usage less error-prone without removing the current static API, you could add a small wrapper that owns the arrays and enforces the “sequential next secret” contract:

```java
public final class IncrementalFeedbackRunner {
    private final int c, d;
    private final int[] guessDigits;
    private final int[] secretDigits;
    private final int[] colorFreqCounter;
    private int black;
    private int colorFreqTotal;

    public IncrementalFeedbackRunner(int guessInd, int secretInd, int c, int d) {
        FeedbackIncremental.State state = FeedbackIncremental.setupIncremental(guessInd, secretInd, c, d);
        this.c = c;
        this.d = d;
        this.guessDigits = state.guessDigits();
        this.secretDigits = state.secretDigits();
        this.colorFreqCounter = state.colorFreqCounter();
        this.black = state.black();
        this.colorFreqTotal = state.colorFreqTotal();
    }

    public int nextFeedback() {
        int[] out = new int[3];
        FeedbackIncremental.getFeedbackIncremental(
                guessDigits, secretDigits, black, colorFreqCounter,
                colorFreqTotal, c, d, out
        );
        this.black = out[1];
        this.colorFreqTotal = out[2];
        return out[0];
    }
}
```

Callers that don’t want to manage the low-level arrays can use `IncrementalFeedbackRunner`, while existing performance‑critical code can continue calling the static method directly.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

src/main/java/org/mastermind/codes/SampledCode.java

src/tests/java/org/mastermind/codes/SampledCodeTest.java

src/tests/java/org/mastermind/solver/ExpectedSizeTest.java

src/tests/java/org/mastermind/solver/SolutionSpaceTest.java

src/tests/java/org/mastermind/MastermindSessionTest.java

src/main/java/org/mastermind/solver/SolutionSpace.java

src/main/java/org/mastermind/codes/CanonicalCode.java

src/main/java/org/mastermind/solver/BestFirstGuess.java

src/main/java/org/mastermind/solver/FeedbackIncremental.java

… for clarity

…ssCalculator

…cases

…ter path

…tegration

FlysonBot added 14 commits March 1, 2026 21:54

feat: add BitSet version of AllValidCode and CanonicalCode along with…

6a5ba8d

… ConvertCode utils for index and code conversion

refactor: pass SolutionSpace directly from MastermindSession to Guess…

594f3d6

…Strategy

refactor: move from code to index representation to avoid overhead

a1e7640

- getFeedback now take index and extract digit directly, so its caller doesn't have to waste time generate code before getFeedback can be called - This change propagated through all methods

refactor: set threads as daemon threads so they shut down automatically

1ce6b5c

feat: add getValidSample method to sample only valid secrets

58b3859

refactor: huge improvement on CanonicalCode by utilizing positional s…

3ae0da1

…ymmetry Now CanonicalCode can cut down the guess space for 9x9 Mastermind from 9^9 to 30.

feat: add BestFirstGuess to precompute and hardcode optimal first gue…

c154a1b

…sses through either brute-force or sampling

refactor: reorganize java classes and benchmarks into folders

65efc6a

fix: ExpectedSize.convertSampleRankToExpectedSize overflow issue

cd61ae1

fix: suggestGuessWithDetails call getSecrets().length on first guess

5d94aed

docs: add inline comments explaining key algorithms

c504bb9

- Explain counter algorithm for white-peg computation in Feedback - Clarify GuessStrategy tolerance/percentile sampling heuristics - Add minor inline comments in various places - Update CLAUDE.md status and current focus section

FlysonBot self-assigned this Mar 5, 2026

chore: remove currently unneeeded GitHub workflow

63025f8

FlysonBot marked this pull request as ready for review March 5, 2026 04:27

sourcery-ai bot reviewed Mar 5, 2026

View reviewed changes

FlysonBot added 8 commits March 4, 2026 21:06

refactor: rename parititon params in CanonicalCode to frequency/color…

622ce51

… for clarity

perf: replace new Random() with ThreadLocalRandom in SampledCode

33acbda

refactor: extract computing logic from BestFirstGuess to BestFirstGue…

13d31b8

…ssCalculator

test: add getValidSample tests covering both sampling paths and edge …

143309a

…cases

test: add getValidSample tests covering both sampling paths and edge …

77c988a

…cases

test: add two-filter test to SolutionSpace convering the standard fil…

91551a7

…ter path

test: add suggestGuessWithDetails test verifying first-guess table in…

479f5c5

…tegration

test: more test to improve coverage

fd18153

FlysonBot merged commit d871d59 into rewrite Mar 5, 2026
2 of 3 checks passed

FlysonBot deleted the rewrite-bitset branch March 5, 2026 08:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: switch solver to index-based representation with optimized first-guess strategy#56

feat: switch solver to index-based representation with optimized first-guess strategy#56
FlysonBot merged 23 commits intorewritefrom
rewrite-bitset

FlysonBot commented Mar 5, 2026 •

edited by sourcery-ai bot

Loading

Uh oh!

sourcery-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

FlysonBot commented Mar 5, 2026 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

OVERVIEW

Summary by Sourcery

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

FlysonBot commented Mar 5, 2026 •

edited by sourcery-ai bot

Loading