Feat/keycloak by robodev-r2d2 · Pull Request #53 · robodev-r2d2/rag-template

robodev-r2d2 · 2026-02-12T10:35:45Z

Thank you for contributing to the RAG Core Library!

Please ensure your PR meets the following requirements:

PR Title: Follow the format "type: description"
- Refer to the conventional commits specification for more details.
PR Description: Replace this checklist with:
- Description: Provide a detailed description of the changes made.
- Issue: Mention the issue number this PR fixes, if applicable.
- Dependencies: List any dependencies required for this change.

Additional Guidelines:

Ensure your code follows established coding conventions.
Include relevant tests and documentation updates.
If no one reviews your PR within a few days, please @-mention a-klos.

Thank you for your contribution!

- Added `setup_keycloak.py` script for automating Keycloak realm and client setup.
- Introduced logging for better traceability during Keycloak operations.
- Updated `pyproject.toml` to include `python-keycloak` dependency.
- Created empty logging and sys files for future enhancements.

- Add KnowledgeSpaceSettings for configuration of knowledge spaces and multitenancy strategies.
- Introduce KnowledgeSpaceAccessService for authorization and scope resolution of knowledge spaces.
- Create KnowledgeSpaceCollectionRouter to map logical knowledge spaces to physical collections.
- Define KnowledgeSpace and DocumentVisibilityMetadata models for knowledge space representation.
- Implement operational tools for managing knowledge spaces, including state loading and saving.
- Add migration script for backfilling legacy metadata in Qdrant collections.
- Develop tests for knowledge space access service, settings validation, and operational tools.

+
+    def _build_trusted_issuers(self) -> set[str]:
+        trusted = self._default_trusted_issuers() | self._configured_trusted_issuers()
+        logger.info("Configured trusted token issuers: %s", sorted(trusted))


In general, to fix clear-text logging issues, we should avoid logging sensitive or potentially sensitive values directly. Instead, we can either remove such logs, reduce their granularity (e.g., log only counts or high-level status), or redact/obfuscate the sensitive parts before logging. The fix must preserve existing functional behavior (authentication logic) while changing only what is sent to the logger.

For this specific case, the problematic code is in `_build_trusted_issuers` in libs/rag-core-api/src/rag_core_api/auth.py, at lines 93–95. The function currently logs the exact list of trusted issuers with `logger.info("Configured trusted token issuers: %s", sorted(trusted))`. To eliminate the clear-text logging of the issuers while keeping the log useful, we can either (a) remove the dynamic value entirely and log only that trusted issuers were configured, or (b) log only aggregate information such as the number of configured issuers. Option (b) still gives operators some visibility without exposing the actual issuer values, and it is fully compatible with the existing logic, since the returned set `trusted` is unchanged and is used as before. Concretely, we will replace the existing log call with something like `logger.info("Configured trusted token issuers: %d issuers", len(trusted))`. No new imports or helper methods are needed; only that one logging line changes.
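Option (b) can be sketched in isolation as follows. This is a minimal, standalone illustration and not the code in auth.py: the free function `build_trusted_issuers`, its two parameters, and the logger name `auth_sketch` are assumptions made for the example, standing in for the method and helper calls shown in the diff.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("auth_sketch")


def build_trusted_issuers(default_issuers: set[str], configured_issuers: set[str]) -> set[str]:
    """Union both issuer sets and log only how many issuers there are."""
    trusted = default_issuers | configured_issuers
    # Aggregate information only: the issuer values themselves never reach the log.
    logger.info("Configured trusted token issuers: %d issuers", len(trusted))
    return trusted
```

The returned set is identical to what the original method would return, so callers are unaffected; only the text sent to the logger changes.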

+                    "JWKS signature failed (kid=%s, alg=%s, source=%s), refreshing and retrying",
+                    kid,
+                    alg,
+                    name,


General fix: ensure that log messages never include sensitive or untrusted values unless they are first redacted or otherwise sanitized. In this case, we want to keep the diagnostic value of knowing which JWKS source failed (issuer vs. default, and whether it was a refreshed set) while guaranteeing that no tainted issuer or URL data is written to logs. That means ensuring the source field in the log message is always a static, non-sensitive label.

Best concrete fix here: avoid using a possibly tainted `name` derived from `jwks_sources` directly in the log call. Instead, introduce a local `source_label` that is computed in a controlled, constant way (e.g., mapping the known sources to "issuer" / "issuer-refreshed" / "default" / "default-refreshed"), and log that label instead of `name`. Since in the current code `name` is already a constant string literal when the tuples are created, we can simply switch the argument for the `%s` placeholder from `name` to a new, clearly non-tainted local that is derived from those known constants and not from any issuer-derived value.

Concretely, in libs/rag-core-api/src/rag_core_api/auth.py, in the loop starting at line 230, change the `logger.info` call at lines 236–241 so that it does not log the potentially tainted `name`. Introduce a `source_label` computed from `name`, for example `source_label = "issuer" if name.startswith("issuer") else "default"`, or more simply `source_label = name.split("-")[0]`, and then pass `source_label` instead of `name` to `logger.info`. This keeps functionality effectively the same (we still distinguish issuer from default, and refreshed versus not refreshed can be inferred from context or restored by tweaking the label logic if needed) while structurally eliminating any propagation of taint into the logging sink. No new imports or external methods are required.
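One way to make the label provably static is a lookup against a fixed table, sketched below. This is an illustration under assumptions rather than the change applied in the PR: the helper `source_label`, the table `_SOURCE_LABELS`, the fallback label "unknown", and the logger name `jwks_sketch` are all hypothetical; only the four label strings come from the suggestion above.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("jwks_sketch")

# Fixed table of the four labels named in the suggestion. Every value that can
# reach the log is a string literal defined here.
_SOURCE_LABELS = {
    "issuer": "issuer",
    "issuer-refreshed": "issuer-refreshed",
    "default": "default",
    "default-refreshed": "default-refreshed",
}


def source_label(name: str) -> str:
    """Return a constant label for a JWKS source; unrecognized names collapse to 'unknown'."""
    return _SOURCE_LABELS.get(name, "unknown")


def log_signature_failure(name: str) -> None:
    """Log a JWKS signature failure using only the static label."""
    logger.info(
        "JWKS signature failed (source=%s), refreshing and retrying",
        source_label(name),
    )
```

Unlike `name.split("-")[0]`, the lookup keeps the refreshed/not-refreshed distinction and never echoes any part of an unexpected input.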

            except ValueError:
                data = response_text
-        elif re.match(r"^application/(json|[\w!#$&.+-^_]+\+json)\s*(;|$)", content_type, re.IGNORECASE):
+        elif re.match(r'^application/(json|[\w!#$&.+-^_]+\+json)\s*(;|$)', content_type, re.IGNORECASE):


+                self._assert_trusted_issuer(decoded.get("iss"))
+                return decoded
+            except InvalidSignatureError as sig_err:
+                logger.info("JWKS signature failed (source=%s), refreshing and retrying", name)


NewDev16 and others added 13 commits December 3, 2025 21:44

feat: Enhance authentication and security features across API endpoints

1d8b240

feat: Integrate Keycloak for RAG configuration access token retrieval

c60f950

feat: Update Keycloak integration and configuration across services

70632a4

feat: Implement lazy access token provider for Keycloak in RAG config…

09a87b9

…uration

feat: Update Keycloak client secret and enhance token retrieval in RA…

2f6d2fb

…G configuration

feat: Enhance Keycloak integration with context-based token managemen…

71485ed

…t and update README for configuration details

feat: Refactor chat and admin app to use runtime configuration for URLs

faa8b78

Merge branch 'main' into feat/keycloak

2c05a31

generate lock files

3521d98

fix: add tasks.md to .gitignore

2cec04f

docs: Update documentation for using existing Keycloak deployment

dc21429

github-advanced-security AI found potential problems Feb 12, 2026

robodev-r2d2 and others added 2 commits February 13, 2026 16:39

Potential fix for code scanning alert no. 24: Clear-text logging of s…

a2f98ee

…ensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

fix: Simplify logging message for JWKS signature failure in AuthMiddl…

e0c61e8

…eware

github-advanced-security AI found potential problems Feb 13, 2026

Merge branch 'main' into feat/keycloak

7649488

Feat/keycloak #53
robodev-r2d2 wants to merge 16 commits into main from feat/keycloak

robodev-r2d2 commented Feb 12, 2026

4 participants

@@ -91,7 +91,7 @@
                 def _build_trusted_issuers(self) -> set[str]:
                     trusted = self._default_trusted_issuers() | self._configured_trusted_issuers()
-                    logger.info("Configured trusted token issuers: %s", sorted(trusted))
+                    logger.info("Configured trusted token issuers: %d issuers", len(trusted))
                     return trusted
                 def _is_trusted_issuer(self, issuer: str | None) -> bool:
