
[LEADS-232] Bump up dependencies for Python 3.11, 3.12, 3.13 support#189

Open
bsatapat-jpg wants to merge 1 commit into lightspeed-core:main from bsatapat-jpg:dev

Conversation

@bsatapat-jpg
Collaborator

@bsatapat-jpg bsatapat-jpg commented Mar 13, 2026

Description

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude-4.5-opus-high
  • Generated by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Dependencies

    • Upgraded many core, data, embedding, and developer packages (including ragas, litellm, deepeval, pydantic, pandas, datasets, scipy, torch, sentence-transformers, ruff, pytest-cov).
  • New Features

    • Added an NLP-metrics group and expanded provider integration for local embeddings and LLMs.
  • Bug Fixes

    • Improved SSL handling, async compatibility, token-tracking, logging, and robustness of metric scoring.
  • Tests

    • Updated unit tests to align with the new metrics and provider APIs.

Collaborator

@asamal4 asamal4 left a comment


Thanks!!
We should not be using internal modules. Check the public API for this.

Collaborator

@asamal4 asamal4 left a comment


Please check the latest versions, e.g. ragas 0.4 is available, similarly deepeval 3.x.
Check the other packages as well.

@asamal4
Collaborator

asamal4 commented Mar 16, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 16, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Contributor

coderabbitai bot commented Mar 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Upgrades ragas to 0.4+ and bumps many dependencies; migrates metric APIs to collections-based .score() calls; replaces custom ragas LLM/embedding wrappers with factory-based managers using litellm; centralizes litellm SSL and token-tracking patches; updates tests and mocks to new APIs.

Changes

Cohort / File(s) Summary
Dependency Manifest
pyproject.toml
Bumped core deps (ragas 0.3.0→0.4.0, deepeval, litellm, pydantic, pandas, datasets, scipy, torch, sentence-transformers); added NLP-metrics extras (sacrebleu, rouge-score, rapidfuzz); dev-tool updates (ruff, pytest-cov).
Constants / Imports
src/lightspeed_evaluation/core/constants.py
Updated import path for DistanceMeasure to ragas.metrics.collections.
LLM: Ragas Manager
src/lightspeed_evaluation/core/llm/ragas.py
Removed RagasCustomLLM; added RagasLLMManager using ragas.llms.llm_factory with provider="litellm", async client, SSL setup; added get_llm() and get_model_info().
LLM: SSL & Patching
src/lightspeed_evaluation/core/llm/litellm_patch.py, src/lightspeed_evaluation/core/llm/custom.py
New setup_litellm_ssl() centralizes SSL config and forces litellm.drop_params; adds no-op logging worker, wraps litellm.completion/litellm.acompletion for token-tracking; custom delegates SSL setup to this function and removes local SSL logic.
Embedding Factory
src/lightspeed_evaluation/core/embedding/ragas.py
Refactored to use ragas.embedding_factory with litellm provider; unified provider→model string resolution; removes provider-specific embedding wrappers and disk-cache backend.
Metrics: Ragas → Collections
src/lightspeed_evaluation/core/metrics/ragas.py, src/lightspeed_evaluation/core/metrics/nlp.py
Migrated to ragas 0.4+ collections API: instantiate metric classes (AnswerRelevancy, Faithfulness, ContextUtilization, Rouge, similarity, etc.) and call .score(response=..., reference=...); extract numeric score from result.value; removed dataset-based evaluation helpers.
Minor Logic Fix
src/lightspeed_evaluation/core/metrics/geval.py
Safer logging for test_case.actual_output (convert to str when present, preserve None).
Tests: Mock Adjustments
tests/unit/core/llm/test_custom.py, tests/unit/core/metrics/conftest.py, tests/unit/core/metrics/test_nlp.py
Updated mocks and fixtures to target score() and MetricResult-like return values; adjusted litellm mocking target to lightspeed_evaluation.core.llm.litellm_patch.litellm; replaced single_turn_score usages with score() side_effects/returns.
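
The migration pattern described above — instantiate a collections-style metric, call .score() with keyword arguments, read the numeric score from result.value — can be illustrated with stand-in types (FakeRouge and FakeMetricResult are hypothetical stand-ins, not the real ragas classes; the toy scoring is not real ROUGE):

```python
from dataclasses import dataclass


@dataclass
class FakeMetricResult:
    """Stand-in for a ragas MetricResult-like object with a .value field."""

    value: float


class FakeRouge:
    """Toy metric with the collections-style score() shape."""

    def score(self, response: str, reference: str) -> FakeMetricResult:
        # Toy word-overlap ratio, not the real ROUGE computation.
        resp, ref = set(response.split()), set(reference.split())
        return FakeMetricResult(value=len(resp & ref) / max(len(ref), 1))


metric = FakeRouge()
result = metric.score(response="the cat sat", reference="the cat sat down")
score = result.value  # numeric score extracted from result.value
```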

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant RagasLLMManager
participant ragas_factory as ragas.llms.llm_factory
participant litellm
participant TokenTracker as track_tokens

Client->>RagasLLMManager: request LLM (get_llm / generate)
RagasLLMManager->>litellm: call setup_litellm_ssl(llm_params)
RagasLLMManager->>ragas_factory: llm_factory(model, provider="litellm", client=litellm.acompletion)
ragas_factory-->>RagasLLMManager: LLM instance
RagasLLMManager->>litellm: call acompletion / completion (wrapped)
litellm->>TokenTracker: response passed to track_tokens (wrapping)
TokenTracker-->>RagasLLMManager: token-tracked response
RagasLLMManager-->>Client: return response / result
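
The completion-wrapping step in the diagram can be sketched with plain callables standing in for litellm.completion/acompletion and the real track_tokens (wrap_with_token_tracking is a hypothetical name for illustration):

```python
import functools


def wrap_with_token_tracking(completion_fn, track_tokens):
    """Return a wrapper that forwards the call and passes every response to track_tokens."""

    @functools.wraps(completion_fn)
    def wrapper(*args, **kwargs):
        response = completion_fn(*args, **kwargs)
        track_tokens(response)  # token accounting sees every response
        return response

    return wrapper


# Usage with stand-in functions:
seen = []
fake_completion = lambda **kw: {"usage": {"total_tokens": 7}}
tracked = wrap_with_token_tracking(fake_completion, seen.append)
resp = tracked(model="stub")
```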

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: updating dependencies to support Python 3.11, 3.12, and 3.13, which aligns with the extensive version bumps across multiple packages shown in pyproject.toml and related files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.




Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
pyproject.toml (1)

33-35: Remove NLP metric packages from base dependencies to make the nlp-metrics extra meaningful.

These three packages are duplicated in both base and optional dependencies. Since sacrebleu, rouge-score, and rapidfuzz are already in base dependencies, the nlp-metrics extra is redundant. Users installing the base package will always get these NLP metric libraries even if they don't need them. Move these packages exclusively to the optional dependency to align with the documented intent (lines 50-54) that they should only be installed when needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 33 - 35, Remove "sacrebleu>=2.0.0",
"rouge-score>=0.1.2", and "rapidfuzz>=3.0.0" from the base dependencies block
and instead list them solely under the optional extra named "nlp-metrics" so
those libraries are installed only when the nlp-metrics extra is requested;
update the extras table so "nlp-metrics" contains exactly these three package
entries and ensure there are no duplicate entries remaining in the base
dependency list.
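
As a concrete sketch of that cleanup, the three packages would appear only under the extra (version pins taken from the review; the exact table layout of the real pyproject.toml may differ):

```toml
# Sketch: NLP metric packages declared only in the optional extra,
# removed from the base [project] dependencies list.
[project.optional-dependencies]
nlp-metrics = [
    "sacrebleu>=2.0.0",
    "rouge-score>=0.1.2",
    "rapidfuzz>=3.0.0",
]
```

Users who need them would then install the extra explicitly, e.g. `pip install lightspeed-evaluation[nlp-metrics]` (package name assumed from the repo layout).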
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyproject.toml`:
- Line 12: The dependency constraint "ragas>=0.3.5" is too loose given our
imports of private modules like ragas.metrics._* (used in constants.py,
llm/ragas.py, metrics/ragas.py, metrics/nlp.py); tighten the constraint to a
tested range (for example pin to a known-compatible release range such as
ragas>=0.3.5,<0.4.0 or a specific tested version) in pyproject.toml, or
alternatively update the imports in constants.py, llm/ragas.py,
metrics/ragas.py, and metrics/nlp.py to use public ragas APIs only so we no
longer depend on private modules.

In `@src/lightspeed_evaluation/core/constants.py`:
- Line 4: Replace the private-module import of DistanceMeasure in constants.py
(currently "from ragas.metrics._string import DistanceMeasure") with the public
API import; update the import to use "from ragas.metrics import DistanceMeasure"
for compatibility with ragas 0.3.5+, or to "from ragas.metrics.collections
import DistanceMeasure" if you target ragas 0.4.x+, ensuring all usages of
DistanceMeasure in this module continue to work with the new import.

In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 98-101: Replace the silent return branch where to_pandas_fn is
missing by raising the project-specific MetricError and emitting a structured
error log; specifically, import MetricError from core.system.exceptions, use the
module logger (e.g., the existing logger in this file) to log a structured error
including metric_name and the missing method, then raise MetricError with a
clear message so evaluate can handle it—update the branch around to_pandas_fn
(and the cast call) to perform these steps instead of returning (None, "...").
- Around line 13-20: Replace the private-module imports (those starting with
underscores) with ragas' public APIs: import ResponseRelevancy, Faithfulness,
LLMContextRecall, LLMContextPrecisionWithReference, and ContextRelevance from
ragas.metrics for ragas 0.3.5–0.3.x or from ragas.metrics.collections for ragas
0.4.0+, and remove imports from ragas.metrics._answer_relevance,
_context_precision, _context_recall, _faithfulness, and _nv_metrics; also check
whether LLMContextPrecisionWithoutReference is available in the public API—if it
is, import it from the same public module, otherwise remove references to
LLMContextPrecisionWithoutReference and update any code using it to use
LLMContextPrecisionWithReference or another supported metric (refer to the
symbols ResponseRelevancy, Faithfulness, LLMContextRecall,
LLMContextPrecisionWithReference, ContextRelevance, and
LLMContextPrecisionWithoutReference to locate and update usages).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c340a46a-d9cd-4f59-a512-b265d0c738c7

📥 Commits

Reviewing files that changed from the base of the PR and between 59749df and 460c2ac.

⛔ Files ignored due to path filters (2)
  • uv-gpu.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/llm/ragas.py
  • src/lightspeed_evaluation/core/metrics/nlp.py
  • src/lightspeed_evaluation/core/metrics/ragas.py

@bsatapat-jpg bsatapat-jpg force-pushed the dev branch 2 times, most recently from f1042f0 to ba42d06 on March 17, 2026, 12:36
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (4)
config/system.yaml (1)

20-35: Re-enable realistic sample defaults for cache/API flags.

Line 20, Line 28, and Line 35 make the sample config less representative for normal usage by default. Consider keeping these enabled in config/system.yaml and overriding in environment-specific configs instead.

Based on learnings, "The maintainer asamal4 prefers sample configurations in config/system.yaml to show realistic usage patterns (e.g., API enabled: true) rather than conservative defaults, as teams are expected to customize the configuration for their specific environments."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/system.yaml` around lines 20 - 35, Update the sample defaults to
enable realistic behavior by setting cache_enabled: true (top-level),
embedding.cache_enabled: true (under embedding) and api.enabled: true (under
api); keep the existing embedding.provider and embedding.model values unchanged
and ensure environment-specific configs can override these defaults.
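
A sketch of the suggested sample defaults (key names follow the review prompt and may not match the actual config schema exactly):

```yaml
# Realistic sample defaults; environment-specific configs override these.
cache_enabled: true
embedding:
  cache_enabled: true
api:
  enabled: true
```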
pyproject.toml (2)

33-59: Avoid duplicating NLP metric dependencies in both base and optional extras.

sacrebleu, rouge-score, and rapidfuzz are now in core dependencies (Line 33-35) and still in nlp-metrics extra (Line 56-58). Keep a single source of truth to prevent drift and confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 33 - 59, The NLP metric packages sacrebleu,
rouge-score, and rapidfuzz are duplicated in the core dependencies and the
nlp-metrics extra; remove the duplicates from one place to keep a single source
of truth — specifically, delete "sacrebleu>=2.0.0", "rouge-score>=0.1.2", and
"rapidfuzz>=3.0.0" from the core dependency list (the top-level dependencies
block) and keep them only in the nlp-metrics optional-dependencies array so the
nlp-metrics extra remains the canonical location for these packages.

12-47: Remove duplicate NLP metrics dependencies from the base dependencies list.

The packages sacrebleu, rouge-score, and rapidfuzz are currently declared in both the main dependencies list (lines 33–35) and the nlp-metrics optional dependency group (lines 47–51). Since these are only required when using NLP metrics features, they should be removed from the base dependencies and kept only in the optional group to reduce installation size for users who don't need them.

Additionally, verify that each bumped minimum dependency version is compatible with Python 3.13 before merging, given the project's support for Python >=3.11,<3.14.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 12 - 47, Remove the duplicate NLP metric
packages from the main dependencies list by deleting "sacrebleu", "rouge-score",
and "rapidfuzz" from the primary dependencies array and keep them only in the
[project.optional-dependencies] nlp-metrics group; then run a quick
compatibility check to ensure the bumped minimum versions (e.g., "ragas>=0.4.0",
"deepeval>=3.8.0", "litellm>=1.50.0", "pydantic>=2.10.0", etc.) are compatible
with Python 3.13 given the project's Python requirement (>=3.11,<3.14) and
adjust versions if any incompatibility is found.
src/lightspeed_evaluation/core/llm/litellm_patch.py (1)

19-25: Narrow the warning filter scope to LiteLLM-specific contexts.

This global filter can hide unrelated coroutine warnings across the process. Restrict by module/message scope so real async bugs remain visible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 19 - 25,
The current global warnings.filterwarnings call is too broad; narrow it to
LiteLLM-specific contexts by adding a module or more specific message pattern so
unrelated coroutine warnings aren't suppressed. Update the existing
warnings.filterwarnings invocation (the call to warnings.filterwarnings in
litellm_patch.py) to include a module argument like module="litellm" or tighten
the message regex to include 'litellm' (e.g., message=".*litellm.*coroutine.*was
never awaited") so only warnings originating from LiteLLM async logging are
ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/llm/litellm_patch.py`:
- Around line 84-95: setup_litellm_ssl currently ignores
llm_params['ssl_cert_file']; update the function so it prefers a configured cert
file when present: if llm_params contains 'ssl_cert_file' and it is truthy, set
litellm.ssl_verify to that file path; otherwise fall back to the
SSL_CERTIFI_BUNDLE env var if set, and finally to the boolean value of
llm_params.get('ssl_verify', True); ensure you still set
litellm.ssl_verify=False when ssl_verify is explicitly false in llm_params so
the logic uses the cert file > env var > configured boolean, referencing
setup_litellm_ssl, llm_params['ssl_cert_file'], and litellm.ssl_verify.

In `@src/lightspeed_evaluation/core/llm/ragas.py`:
- Line 50: Replace the free-form message log in ragas.py so the model name is
emitted as a structured field instead of interpolated text: change the
logger.info call (the line using logger.info and self.model_name) to log a
concise message and pass model_name as structured data (e.g., logger.info("Ragas
LLM Manager configured", extra={"model_name": self.model_name})). This makes
model_name queryable in logs while keeping the same log level and intent.

In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Line 14: Replace the legacy ContextUtilization import and usages with the
modern ContextPrecisionWithoutReference: update the import statement that
currently imports ContextUtilization to instead import
ContextPrecisionWithoutReference from ragas.metrics.collections, then update
every instantiation/usage of the ContextUtilization class (the code that creates
or references ContextUtilization in the metrics setup, previously around the
block handling context precision without reference) to use
ContextPrecisionWithoutReference so the metric and its parameters are compatible
with ragas v0.4+.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9507cdf0-ae13-431f-883b-db885af5752d

📥 Commits

Reviewing files that changed from the base of the PR and between 460c2ac and f1042f0.

⛔ Files ignored due to path filters (2)
  • uv-gpu.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • config/system.yaml
  • pyproject.toml
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/llm/custom.py
  • src/lightspeed_evaluation/core/llm/litellm_patch.py
  • src/lightspeed_evaluation/core/llm/ragas.py
  • src/lightspeed_evaluation/core/metrics/geval.py
  • src/lightspeed_evaluation/core/metrics/nlp.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • tests/unit/core/llm/test_custom.py
  • tests/unit/core/metrics/conftest.py
  • tests/unit/core/metrics/test_nlp.py
✅ Files skipped from review due to trivial changes (1)
  • src/lightspeed_evaluation/core/metrics/geval.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/metrics/nlp.py

@bsatapat-jpg
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (2)
src/lightspeed_evaluation/core/llm/litellm_patch.py (2)

84-95: ⚠️ Potential issue | 🟠 Major

Honor configured ssl_cert_file when enabling SSL verification.

Line 90-Line 93 currently ignores llm_params["ssl_cert_file"], so explicit CA bundle configuration can’t be applied through config.

🛠️ Proposed fix
 def setup_litellm_ssl(llm_params: dict[str, Any]) -> None:
@@
-    ssl_verify = llm_params.get("ssl_verify", True)
+    ssl_verify = llm_params.get("ssl_verify", True)
+    ssl_cert_file = llm_params.get("ssl_cert_file")
@@
-    if ssl_verify:
-        litellm.ssl_verify = os.environ.get("SSL_CERTIFI_BUNDLE", True)
+    if ssl_verify:
+        litellm.ssl_verify = (
+            ssl_cert_file
+            or os.environ.get("SSL_CERT_FILE")
+            or os.environ.get("REQUESTS_CA_BUNDLE")
+            or os.environ.get("SSL_CERTIFI_BUNDLE")
+            or True
+        )
     else:
         litellm.ssl_verify = False
For LiteLLM 1.50.0, what are the officially supported values for `litellm.ssl_verify` (bool vs CA bundle path), and which environment variables are documented for custom CA bundles?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 84 - 95,
The setup_litellm_ssl function currently ignores llm_params["ssl_cert_file"];
update setup_litellm_ssl to prefer an explicit CA bundle from
llm_params.get("ssl_cert_file") when ssl_verify is truthy, falling back to the
SSL_CERTIFI_BUNDLE env var and finally to True; set litellm.ssl_verify to the CA
bundle path (string) if provided or to a boolean False when ssl_verify is falsy
so litellm.ssl_verify reflects either a path or boolean as supported by LiteLLM,
and keep using llm_params.get("ssl_verify", True) to decide whether to disable
verification.

35-40: ⚠️ Potential issue | 🟠 Major

Remove suppression directives introduced in this production module.

Line 35-Line 40, Line 73, and Line 77-Line 79 add # pylint: disable, # type: ignore, and # noqa rather than fixing the underlying issues.

As per coding guidelines, "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead".

Also applies to: 73-73, 77-79

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 35 - 40,
Remove the lint suppression comments and fix the underlying issues instead of
silencing them: replace the uses of protected attributes
(litellm.suppress_debug_info, litellm._async_success_callback,
litellm._async_failure_callback, litellm._async_input_callback) by using the
library's public API or safe accessors (e.g., provide a public setter or call a
documented init/config function), and address any type errors by adding proper
typing or runtime guards rather than using # type: ignore; also remove any #
noqa and resolve the actual style/unused import issues they were hiding (update
signatures, imports, or refactor code around the other suppression sites
referenced so everything passes linters without disable comments).
🧹 Nitpick comments (1)
pyproject.toml (1)

32-35: Avoid declaring NLP metric packages in both core and optional dependencies.

Line 33-Line 35 duplicate Line 56-Line 59. Keeping them in both places makes the extra effectively non-optional and adds dependency maintenance overhead.

♻️ Suggested cleanup
 dependencies = [
@@
-    # NLP metrics dependencies
-    "sacrebleu>=2.0.0",
-    "rouge-score>=0.1.2",
-    "rapidfuzz>=3.0.0",
 ]

Also applies to: 55-59

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 32 - 35, The pyproject.toml currently declares
NLP metric packages ("sacrebleu", "rouge-score", "rapidfuzz") in both the core
dependency section and the optional extras, making them effectively non-optional
and duplicative; remove the duplicate declarations from one place so each
package is listed only once (either keep them in the core dependencies or move
them solely under the optional extras block), update the extras group
accordingly, and ensure references to "sacrebleu", "rouge-score", and
"rapidfuzz" are present only in the chosen section to eliminate redundancy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/llm/ragas.py`:
- Around line 37-48: self.llm_params is populated but only ssl_verify is used;
extract the inference options (temperature, max_completion_tokens, timeout,
num_retries) from self.llm_params into an inference_kwargs dict and pass them
into llm_factory when creating self.llm so ragas does not use its defaults;
update the block around setup_litellm_ssl(self.llm_params) and the
llm_factory(...) call to include **inference_kwargs (keep provider="litellm" and
client=litellm.acompletion unchanged).

In `@src/lightspeed_evaluation/core/metrics/geval.py`:
- Around line 314-317: The inline pylint suppression should be removed and the
preview slicing should operate on a guaranteed string: coerce
test_case.actual_output to a string (e.g., preview =
str(test_case.actual_output) if test_case.actual_output is not None else
"None"), then use preview[:100] in the logging call instead of directly slicing
test_case.actual_output; remove the "# pylint: disable=unsubscriptable-object"
comment and update the log invocation that references test_case.actual_output to
use the new preview variable.

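
The inference-options extraction suggested for ragas.py above can be sketched as follows (key names follow the review; the llm_factory keyword signature itself is assumed, not verified against a specific ragas release):

```python
# Only these configured options are forwarded, so ragas defaults don't win silently.
INFERENCE_KEYS = ("temperature", "max_completion_tokens", "timeout", "num_retries")


def build_inference_kwargs(llm_params: dict) -> dict:
    """Pick the inference options present in llm_params for passing to llm_factory."""
    return {k: llm_params[k] for k in INFERENCE_KEYS if k in llm_params}


kwargs = build_inference_kwargs({"temperature": 0.0, "timeout": 30, "api_key": "x"})
```

These would then be spread into the factory call, e.g. `llm_factory(model, provider="litellm", client=litellm.acompletion, **kwargs)`.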

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3f530ae3-f4d6-46a9-bb7c-03797eb78220

📥 Commits

Reviewing files that changed from the base of the PR and between f1042f0 and ba42d06.

⛔ Files ignored due to path filters (2)
  • uv-gpu.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • pyproject.toml
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/llm/custom.py
  • src/lightspeed_evaluation/core/llm/litellm_patch.py
  • src/lightspeed_evaluation/core/llm/ragas.py
  • src/lightspeed_evaluation/core/metrics/geval.py
  • src/lightspeed_evaluation/core/metrics/nlp.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • tests/unit/core/llm/test_custom.py
  • tests/unit/core/metrics/conftest.py
  • tests/unit/core/metrics/test_nlp.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unit/core/llm/test_custom.py
  • src/lightspeed_evaluation/core/constants.py

Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (3)
src/lightspeed_evaluation/core/metrics/ragas.py (1)

14-14: ⚠️ Potential issue | 🟠 Major

Replace ContextUtilization with ContextPrecisionWithoutReference for forward compatibility.

Per a previous review comment, ContextUtilization is a legacy backward-compatibility wrapper in ragas v0.4.x+. The preferred metric for "context precision without reference" is ContextPrecisionWithoutReference from ragas.metrics.collections.

Proposed fix

Update import at line 14:

 from ragas.metrics.collections import (
     AnswerRelevancy,
     ContextPrecision,
+    ContextPrecisionWithoutReference,
     ContextRecall,
     ContextRelevance,
-    ContextUtilization,
     Faithfulness,
 )

Update usage at line 172:

-        metric = ContextUtilization(llm=self.llm_manager.get_llm())
+        metric = ContextPrecisionWithoutReference(llm=self.llm_manager.get_llm())

Also applies to: 172-172

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/metrics/ragas.py` at line 14, The code imports
and uses the legacy ContextUtilization metric; replace that import with
ContextPrecisionWithoutReference from ragas.metrics.collections and update any
instantiation/usages of ContextUtilization to use
ContextPrecisionWithoutReference instead (e.g., change the import at the top
that currently lists ContextUtilization and replace the symbol used around the
ContextUtilization usage near the metric creation at line ~172 so the code
constructs/refs ContextPrecisionWithoutReference).
src/lightspeed_evaluation/core/llm/litellm_patch.py (2)

84-98: ⚠️ Potential issue | 🟠 Major

ssl_cert_file from config is still not applied to LiteLLM SSL verification.

Per a previous review comment, llm_params.get("ssl_cert_file") is not being used. The function only reads ssl_verify and falls back to SSL_CERTIFI_BUNDLE environment variable, ignoring any custom CA path configured in ssl_cert_file.

Proposed fix to apply ssl_cert_file
 def setup_litellm_ssl(llm_params: dict[str, Any]) -> None:
     """Configure litellm SSL verification and drop_params.
 
     Args:
         llm_params: Dictionary containing LLM parameters including 'ssl_verify'
+                    and optionally 'ssl_cert_file'
     """
     ssl_verify = llm_params.get("ssl_verify", True)
+    ssl_cert_file = llm_params.get("ssl_cert_file")
 
     if ssl_verify:
-        litellm.ssl_verify = os.environ.get("SSL_CERTIFI_BUNDLE", True)
+        litellm.ssl_verify = (
+            ssl_cert_file
+            or os.environ.get("SSL_CERT_FILE")
+            or os.environ.get("REQUESTS_CA_BUNDLE")
+            or os.environ.get("SSL_CERTIFI_BUNDLE", True)
+        )
     else:
         litellm.ssl_verify = False
 
     # Always drop unsupported parameters for cross-provider compatibility
     litellm.drop_params = True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 84 - 98,
The setup_litellm_ssl function ignores llm_params["ssl_cert_file"] and never
applies a custom CA bundle; update setup_litellm_ssl to read ssl_cert_file from
llm_params (e.g., ssl_cert_file = llm_params.get("ssl_cert_file")), and when
ssl_verify is truthy set litellm.ssl_verify to the custom path if provided,
otherwise fall back to os.environ.get("SSL_CERTIFI_BUNDLE") (and ensure the
fallback default is a boolean or path as appropriate); keep the existing
behavior for disabling verification (litellm.ssl_verify = False) and continue to
set litellm.drop_params = True.

35-40: ⚠️ Potential issue | 🟠 Major

Multiple lint suppression directives remain in production code.

The file contains several # pylint: disable, # type: ignore, and # noqa comments. While some may be justified for interacting with litellm internals, the coding guidelines require fixing underlying issues rather than suppressing warnings.

Key suppressions:

  • Line 35-40: protected-access for litellm private attributes
  • Line 43: too-few-public-methods for _NoOpLoggingWorker
  • Line 73: type: ignore[assignment] for logging worker replacement
  • Lines 77, 79: wrong-import-position and noqa: E402 for delayed import
  • Lines 112, 123: broad-exception-caught in token tracking

The protected-access suppressions for litellm internals may be unavoidable given the need to disable async logging, but consider documenting why each suppression is necessary. The broad-exception-caught at lines 112 and 123 could be narrowed to specific exception types if the failure modes of track_tokens are known.

As per coding guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead".

Also applies to: 43-43, 73-73, 77-79, 112-112, 123-123

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 35 - 40,
The code currently suppresses multiple linters around interacting with litellm
internals (litellm.suppress_debug_info, litellm._async_success_callback,
litellm._async_failure_callback, litellm._async_input_callback) and silences
other checks for _NoOpLoggingWorker and a logging-worker replacement; instead of
blanket disables, either use public APIs or add a short justificatory comment
above each suppression explaining why direct access is required (e.g., disabling
litellm async logging) and limit the scope of the disable to the minimum lines,
and for track_tokens replace the broad "except Exception" handlers with specific
exception classes that you expect (or re-raise after handling) in the
functions/methods named track_tokens so linting isn’t suppressed; update
comments near _NoOpLoggingWorker and the logging-worker assignment to document
why a lightweight stub is needed and remove unrelated noqa/type:ignore where
possible.
🧹 Nitpick comments (2)
src/lightspeed_evaluation/core/embedding/ragas.py (1)

30-44: Provider mapping covers common cases but azure is not in the allowed providers list.

The provider mapping includes azure, vertex, and bedrock (lines 36-41), but according to EmbeddingConfig (from context snippet), the allowed providers are only {"openai", "huggingface", "gemini"}. This mapping will never be reached for azure/vertex/bedrock since validation would fail first.

Either:

  1. Remove the unreachable provider mappings, or
  2. Update EmbeddingConfig._validate_provider to allow these additional providers if they should be supported.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 30 - 44, The
provider mapping in the provider -> model_str branch (handling "azure",
"vertex", "bedrock") is unreachable because EmbeddingConfig currently restricts
allowed providers to {"openai","huggingface","gemini"}; either remove those
unreachable branches (the elifs for "azure", "vertex", "bedrock") from the
provider -> model_str logic, or extend EmbeddingConfig._validate_provider to
include the additional providers ("azure","vertex_ai"/"vertex","bedrock") so
validation permits them; update any provider naming conventions so the string
produced by the provider -> model_str block matches the canonical provider
identifiers used by _validate_provider (referencing the provider variable and
the model_str assignment logic).
pyproject.toml (1)

32-36: NLP metrics dependencies are duplicated in main and optional dependencies.

The same packages (sacrebleu>=2.0.0, rouge-score>=0.1.2, rapidfuzz>=3.0.0) appear in both the main dependencies list (lines 32-35) and the nlp-metrics optional group (lines 56-58). If they're always required, they should only be in main dependencies. If they're optional, they should only be in the optional group.

Consider removing duplication

If NLP metrics are always required:

 dependencies = [
     ...
-    # NLP metrics dependencies
-    "sacrebleu>=2.0.0",
-    "rouge-score>=0.1.2",
-    "rapidfuzz>=3.0.0",
 ]

Or if they should remain optional, remove them from the main dependencies list.

Also applies to: 55-59

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 32 - 36, The pyproject lists sacrebleu,
rouge-score, and rapidfuzz in both the top-level dependencies and the
"nlp-metrics" optional group; decide whether these packages are required or
optional and remove the duplicates accordingly—if they are always required,
delete them from the "nlp-metrics" optional group; if they are optional, remove
them from the top-level dependencies and keep them only under the "nlp-metrics"
extra; ensure the three package specifiers (sacrebleu>=2.0.0,
rouge-score>=0.1.2, rapidfuzz>=3.0.0) are present in only one place and update
any docs or install instructions to match.
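If the "keep them optional" option is chosen, the extras group alone would carry the three specifiers. A hedged TOML fragment (the `nlp-metrics` group name and version pins are taken from the review comment, not verified against the PR):

```toml
# sacrebleu / rouge-score / rapidfuzz removed from [project].dependencies;
# they live only under the optional extra.
[project.optional-dependencies]
nlp-metrics = [
    "sacrebleu>=2.0.0",
    "rouge-score>=0.1.2",
    "rapidfuzz>=3.0.0",
]
```

Users would then opt in with `pip install "lightspeed-evaluation[nlp-metrics]"` (package name assumed for illustration).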

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 50128289-9542-4b54-ba67-8dc922191825

📥 Commits

Reviewing files that changed from the base of the PR and between ba42d06 and 8146522.

⛔ Files ignored due to path filters (2)
  • uv-gpu.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • pyproject.toml
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/llm/custom.py
  • src/lightspeed_evaluation/core/llm/litellm_patch.py
  • src/lightspeed_evaluation/core/llm/ragas.py
  • src/lightspeed_evaluation/core/metrics/geval.py
  • src/lightspeed_evaluation/core/metrics/nlp.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • tests/unit/core/llm/test_custom.py
  • tests/unit/core/metrics/conftest.py
  • tests/unit/core/metrics/test_nlp.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lightspeed_evaluation/core/metrics/nlp.py
  • src/lightspeed_evaluation/core/metrics/geval.py

Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
src/lightspeed_evaluation/core/llm/litellm_patch.py (2)

90-95: ⚠️ Potential issue | 🟠 Major

setup_litellm_ssl still ignores ssl_cert_file from config.

This path applies only ssl_verify and environment fallback, so explicitly configured CA bundle paths in runtime config still won’t be honored.

🔧 Suggested fix
 def setup_litellm_ssl(llm_params: dict[str, Any]) -> None:
@@
-    ssl_verify = llm_params.get("ssl_verify", True)
+    ssl_verify = llm_params.get("ssl_verify", True)
+    ssl_cert_file = llm_params.get("ssl_cert_file")
@@
-    if ssl_verify:
-        litellm.ssl_verify = os.environ.get("SSL_CERTIFI_BUNDLE", True)
+    if ssl_verify:
+        litellm.ssl_verify = (
+            ssl_cert_file
+            or os.environ.get("SSL_CERT_FILE")
+            or os.environ.get("REQUESTS_CA_BUNDLE")
+            or os.environ.get("SSL_CERTIFI_BUNDLE")
+            or True
+        )
     else:
         litellm.ssl_verify = False
#!/bin/bash
# Verify ssl_cert_file is modeled and whether setup_litellm_ssl consumes it.
fd -t f 'constants.py|system.py|litellm_patch.py' src \
  | xargs rg -n "ssl_cert_file|setup_litellm_ssl|ssl_verify|SSL_CERTIFI_BUNDLE|SSL_CERT_FILE|REQUESTS_CA_BUNDLE"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 90 - 95,
The setup_litellm_ssl code only uses ssl_verify and the SSL_CERTIFI_BUNDLE env
var, so it ignores a configured ssl_cert_file; update setup_litellm_ssl to check
llm_params.get("ssl_cert_file") (or equivalent config field) and, when present
and ssl_verify is true, set litellm.ssl_verify to that file path instead of the
env fallback, otherwise fall back to os.environ.get("SSL_CERTIFI_BUNDLE") and
finally to False when ssl_verify is false; modify the block that currently
writes to litellm.ssl_verify (in setup_litellm_ssl) to implement this
precedence: explicit ssl_cert_file -> SSL_CERTIFI_BUNDLE env var -> True/False
as appropriate.

35-40: ⚠️ Potential issue | 🟠 Major

Remove inline suppression directives from this production module.

The new # pylint: disable, # type: ignore, and # noqa directives are policy violations in src/** and should be replaced by code changes that satisfy lint/type/import-order checks.

As per coding guidelines: **/*.py: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.

Also applies to: 73-79

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/llm/litellm_patch.py` around lines 35 - 40,
Remove the inline pylint suppression and instead set the module attributes using
safe, lint-friendly operations: delete the "# pylint: disable=protected-access"
and "# pylint: enable=protected-access" comments and replace the direct
protected-member assignments with setattr calls on the litellm module (e.g.,
setattr(litellm, "suppress_debug_info", True), setattr(litellm,
"_async_success_callback", []), setattr(litellm, "_async_failure_callback", []),
setattr(litellm, "_async_input_callback", [])); apply the same pattern for the
similar block at lines 73-79 so no "# type: ignore"/"# noqa"/pylint disables are
used and protected-access warnings are avoided.
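The `setattr` pattern suggested above can be demonstrated in isolation. Here `litellm` is a `SimpleNamespace` stand-in so the sketch runs without the library installed; the attribute names are the ones listed in the review:

```python
from types import SimpleNamespace

# Stand-in for the real litellm module (assumption for illustration).
litellm = SimpleNamespace()

# setattr on the module object avoids the obj._name attribute syntax
# that triggers pylint's protected-access check.
setattr(litellm, "suppress_debug_info", True)
for name in (
    "_async_success_callback",
    "_async_failure_callback",
    "_async_input_callback",
):
    setattr(litellm, name, [])  # clear litellm's async callback lists
```

Note this only silences the linter's syntactic check; the code still depends on `litellm` internals, so a comment documenting why no public API suffices is still worthwhile.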
🧹 Nitpick comments (2)
src/lightspeed_evaluation/core/embedding/ragas.py (1)

30-44: Align provider branching with validated provider set to reduce dead paths.

EmbeddingConfig currently validates only openai, huggingface, and gemini, so azure/vertex/bedrock branches are unreachable today. Consider collapsing this into a mapping for current providers (and keeping a single explicit fallback warning path).

♻️ Suggested cleanup
-        if provider == "openai":
-            model_str = model  # OpenAI models don't need prefix
-        elif provider == "huggingface":
-            model_str = f"huggingface/{model}"
-        elif provider == "gemini":
-            model_str = f"gemini/{model}"
-        elif provider == "azure":
-            model_str = f"azure/{model}"
-        elif provider == "vertex":
-            model_str = f"vertex_ai/{model}"
-        elif provider == "bedrock":
-            model_str = f"bedrock/{model}"
-        else:
-            # Generic litellm format
-            model_str = f"{provider}/{model}"
+        provider_prefix = {
+            "openai": "",
+            "huggingface": "huggingface/",
+            "gemini": "gemini/",
+        }
+        prefix = provider_prefix.get(provider)
+        if prefix is None:
+            logger.warning(
+                "Unexpected embedding provider '%s'; using generic litellm prefix.",
+                provider,
+            )
+            model_str = f"{provider}/{model}"
+        else:
+            model_str = f"{prefix}{model}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 30 - 44, The
provider branching in ragas.py creates unreachable branches
(azure/vertex/bedrock) because EmbeddingConfig only validates openai,
huggingface, and gemini; replace the long if/elif chain that sets model_str with
a provider->format mapping for the validated providers (e.g., mapping for
"openai", "huggingface", "gemini") and use a single fallback branch that
logs/warns and formats as f"{provider}/{model}" for any unexpected provider;
update the code that assigns model_str (the block referencing variables provider
and model) to use the mapping lookup and the single explicit fallback so dead
paths are removed and future providers are handled consistently.
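The mapping-plus-fallback shape described in the prompt above can be sketched as a standalone helper. The function name `build_model_str` and the `PROVIDER_PREFIX` table are hypothetical; the three providers are the ones `EmbeddingConfig` currently validates:

```python
import logging

logger = logging.getLogger(__name__)

# Prefixes for the providers EmbeddingConfig accepts today.
PROVIDER_PREFIX = {
    "openai": "",          # OpenAI models need no prefix
    "huggingface": "huggingface/",
    "gemini": "gemini/",
}


def build_model_str(provider: str, model: str) -> str:
    """Map a validated provider to a litellm model string, with one fallback."""
    prefix = PROVIDER_PREFIX.get(provider)
    if prefix is None:
        logger.warning(
            "Unexpected embedding provider '%s'; using generic litellm prefix.",
            provider,
        )
        return f"{provider}/{model}"
    return f"{prefix}{model}"
```

Adding a future provider then becomes a one-line table entry instead of a new `elif` branch.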
pyproject.toml (1)

32-35: Redundant NLP metrics declarations.

The same packages (sacrebleu, rouge-score, rapidfuzz) are declared both in main dependencies (lines 32-35) and in the nlp-metrics optional group (lines 55-58). Since main dependencies are always installed, the optional group becomes redundant.

Either remove these packages from main dependencies if they should truly be optional, or remove the nlp-metrics optional group if they're now required.

Also applies to: 55-58

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 32 - 35, Remove the duplicate NLP metric
packages from one place: decide whether sacrebleu, rouge-score, and rapidfuzz
should be required or optional, then update pyproject.toml accordingly — either
delete the three package entries ("sacrebleu>=2.0.0", "rouge-score>=0.1.2",
"rapidfuzz>=3.0.0") from the main dependencies section if you want them only in
the optional extras group named nlp-metrics, or remove the nlp-metrics extra
(and its entries) if they should remain required; make sure to update only the
entries for those package names and keep the rest of the dependency lists
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8df21ba8-a77d-4489-94f8-c4833424f0b2

📥 Commits

Reviewing files that changed from the base of the PR and between 8146522 and 7b312ef.

⛔ Files ignored due to path filters (2)
  • uv-gpu.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • pyproject.toml
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/llm/custom.py
  • src/lightspeed_evaluation/core/llm/litellm_patch.py
  • src/lightspeed_evaluation/core/llm/ragas.py
  • src/lightspeed_evaluation/core/metrics/geval.py
  • src/lightspeed_evaluation/core/metrics/nlp.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • tests/unit/core/llm/test_custom.py
  • tests/unit/core/metrics/conftest.py
  • tests/unit/core/metrics/test_nlp.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/core/metrics/test_nlp.py
