New prompts #34

Open
bharathappali wants to merge 3 commits into causaai:poc-mar26 from bharathappali:new-prompts


bharathappali commented Mar 30, 2026

This PR is built on top of PR #33. Please review it after #33 is merged.

Summary by Sourcery

Introduce a richer, structured root cause analysis experience with new LLM prompts, parsing logic, and UI layout to surface severity, confidence, evidence, symptoms, and affected services.

New Features:

  • Add structured fields to RCA reports for severity, confidence, supporting evidence bullets, observable symptoms, and affected services, extracted directly from the RCA analyst LLM output.
  • Update the analysis details page to present root cause analysis in a side-by-side layout with badges, tags, and evidence sections tailored to the new structured RCA data.
  • Extend RCA analyst and GC pause detector prompt definitions to enforce stricter, richer output formats including severity, supporting evidence, observable symptoms, and affected services.

Enhancements:

  • Refine evidence extraction in the orchestrator to rely on structured sections from the RCA analyst instead of reconstructing evidence from matched logs.
  • Improve example prompts and sample outputs in assertion/evidence agents to use generic container placeholders for better reusability.

shekhar316 and others added 3 commits March 30, 2026 13:35
Signed-off-by: Shekhar Saxena <shekhar.saxena@ibm.com>
Signed-off-by: bharathappali <abharath@redhat.com>
Signed-off-by: bharathappali <abharath@redhat.com>

sourcery-ai Bot commented Mar 30, 2026

Reviewer's Guide

Adds structured RCA output fields from the LLM, wires them through the orchestrator and report model, and introduces a new two-column RCA overview UI showing severity, confidence, symptoms, affected services, and supporting evidence. Also updates the prompts for RCA and GC pause detection and makes minor placeholder text tweaks.

Sequence diagram for structured RCA extraction and UI rendering

sequenceDiagram
    actor User
    participant Browser
    participant Backend
    participant RcaOrchestrator
    participant RootCauseAnalyst
    participant GcPauseDetector

    User->>Browser: Request analysis_details page
    Browser->>Backend: HTTP GET /analysis_details?sessionId
    Backend->>RcaOrchestrator: runAnalysisInternal(sessionId, namespace, podName)

    RcaOrchestrator->>RootCauseAnalyst: analyzeRootCause(anomalyType, llmContext)
    RootCauseAnalyst-->>RcaOrchestrator: rcaOutput (structured text)

    RcaOrchestrator->>GcPauseDetector: detectGcPause(summarizedLogs)
    GcPauseDetector-->>RcaOrchestrator: gcResult

    RcaOrchestrator->>RcaOrchestrator: extractRootCause(rcaOutput)
    RcaOrchestrator->>RcaOrchestrator: extractKeyEvidence(rcaOutput)
    RcaOrchestrator->>RcaOrchestrator: extractSupportedLogs(rcaOutput)
    RcaOrchestrator->>RcaOrchestrator: extractSeverity(rcaOutput)
    RcaOrchestrator->>RcaOrchestrator: extractSupportingEvidenceBullets(rcaOutput)
    RcaOrchestrator->>RcaOrchestrator: extractObservableSymptoms(rcaOutput)
    RcaOrchestrator->>RcaOrchestrator: extractAffectedServices(rcaOutput)

    RcaOrchestrator->>RcaOrchestrator: build finalReport map with structured fields
    RcaOrchestrator-->>Backend: RcaReport (including severity, confidenceLevel, supportingEvidenceBullets, observableSymptoms, affectedServices)

    Backend-->>Browser: Render analysisDetails.html with session.report
    Browser-->>User: Show two column RCA overview (main issue and supporting evidence)
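The flow in the sequence diagram can be sketched roughly as follows. This is a minimal, hypothetical illustration rather than the PR's code: the `RootCauseAnalyst` signature is taken from the class diagram in this guide, while `extractLine`, the map keys `rootCause` and `severity`, the `"Unknown"` fallback, the `"GC_PAUSE"` anomaly type, and the stubbed analyst output are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "call the analyst, extract fields, build the report map".
final class RcaFlowSketch {

    // Interface shape as shown in the class diagram.
    interface RootCauseAnalyst {
        String analyzeRootCause(String anomalyType, String llmContext);
    }

    // Assumed helper: returns the single-line value after "HEADER:", or the fallback.
    static String extractLine(String rcaOutput, String header, String fallback) {
        Pattern p = Pattern.compile(
                "(?im)^" + Pattern.quote(header) + "[ \\t]*:[ \\t]*(.+)$");
        Matcher m = p.matcher(rcaOutput);
        return m.find() ? m.group(1).trim() : fallback;
    }

    // Runs the analyst once and copies the extracted values into the report map.
    static Map<String, Object> buildFinalReport(
            RootCauseAnalyst analyst, String anomalyType, String llmContext) {
        String rcaOutput = analyst.analyzeRootCause(anomalyType, llmContext);
        Map<String, Object> finalReport = new LinkedHashMap<>();
        finalReport.put("rootCause", extractLine(rcaOutput, "ROOT_CAUSE", "Unknown"));
        finalReport.put("severity", extractLine(rcaOutput, "SEVERITY", "Medium"));
        return finalReport;
    }

    public static void main(String[] args) {
        // Stub standing in for the LLM-backed analyst.
        RootCauseAnalyst stub = (type, ctx) -> "ROOT_CAUSE: Heap exhausted\nSEVERITY: High\n";
        System.out.println(buildFinalReport(stub, "GC_PAUSE", "summarized logs"));
    }
}
```

Passing the analyst in as an interface keeps the extraction step testable without an LLM call, which is the only design point this sketch is meant to show.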

Updated class diagram for RCA orchestration and report model

classDiagram
    class RcaOrchestrator {
        -Logger LOG
        -ObjectMapper mapper
        +void runAnalysisInternal(String sessionId, String namespace, String podName)
        -String extractRootCause(String rcaOutput)
        -List~String~ extractSupportedLogs(String rcaOutput)
        -String extractKeyEvidence(String rcaOutput)
        -String parseGcAnomalyType(String raw)
        -String extractSeverity(String rcaOutput)
        -List~String~ extractSupportingEvidenceBullets(String rcaOutput)
        -List~String~ extractObservableSymptoms(String rcaOutput)
        -List~String~ extractAffectedServices(String rcaOutput)
    }

    class RootCauseAnalyst {
        <<interface>>
        +String analyzeRootCause(String anomalyType, String llmContext)
    }

    class GcPauseDetector {
        <<interface>>
        +String detectGcPause(String summarizedLogs)
    }

    class RcaReport {
        +String title
        +String issue
        +String highLevelIssue
        +String subLevelIssue
        +String anomalyType
        +List~String~ validationChecks
        +List~AssertionItem~ assertions
        +FinalDecision finalDecision
        +String severity
        +String confidenceLevel
        +List~String~ supportingEvidenceBullets
        +List~String~ observableSymptoms
        +List~String~ affectedServices
    }

    class AssertionItem
    class FinalDecision

    RcaOrchestrator --> RootCauseAnalyst : uses
    RcaOrchestrator --> GcPauseDetector : uses
    RcaOrchestrator --> RcaReport : populates
    RcaReport "*" --> "*" AssertionItem : contains
    RcaReport --> FinalDecision : has

File-Level Changes

Change: Parse new structured RCA fields from LLM output and include them in the final report instead of deriving evidence from matched logs.
Details:
  • Add helper methods to extract supported logs, key evidence, severity, supporting evidence bullets, observable symptoms, and affected services from the RCA Analyst text output using regex-based parsing with defensive defaults.
  • Call the new extractors early in runAnalysisInternal, log the extracted values, and stop rebuilding evidence/supportedLogs from matched assertion logs.
  • Augment the finalReport map with severity, confidenceLevel (based on validation status), supportingEvidenceBullets, observableSymptoms, and affectedServices so they are available to the UI.
Files:
  • src/main/java/com/causa/rca/service/RcaOrchestrator.java

Change: Extend the RCA LLM prompt and GC pause detector prompt to produce richer, strictly structured outputs including severity, evidence, symptoms, and OOM-risk-aware GC explanations.
Details:
  • Update RootCauseAnalyst prompt to require a fixed, extended structure including ROOT_CAUSE_TITLE, SEVERITY, ROOT_CAUSE, SUPPORTING_EVIDENCE, OBSERVABLE_SYMPTOMS, AFFECTED_SERVICES, KEY_EVIDENCE, and SUPPORTED_LOGS with explicit bullet/count rules.
  • Clarify rules around evidence being signal-based only, require specific bullet counts and ordering, and stress extraction of real service/container names for AFFECTED_SERVICES.
  • Expand GcPauseDetector system prompt to include OOM risk assessment, detailed detection criteria, and stricter output structure for ANOMALY_TYPE and EXPLANATION.
Files:
  • src/main/java/com/causa/rca/ai/RootCauseAnalyst.java
  • src/main/java/com/causa/rca/ai/GcPauseDetector.java

Change: Add new structured RCA fields to the RcaReport model to support the richer UI and report semantics.
Details:
  • Introduce severity, confidenceLevel, supportingEvidenceBullets, observableSymptoms, and affectedServices fields on RcaReport.
  • Document intended usage of these fields in comments (severity level, confidence text, evidence bullets, symptom metrics, and affected services list).
Files:
  • src/main/java/com/causa/rca/model/RcaReport.java

Change: Redesign the RCA section of the analysis details UI into a two-column layout that surfaces severity, confidence, affected services, observable symptoms, and supporting evidence, with new styling.
Details:
  • Replace the prior tabbed RCA/High-Level/Detailed issues layout with a single RCA overview grid: left column for the main issue, severity/confidence badges, description, affected services chips, and observable symptom list with contextual icons; right column for supporting evidence bullets and an informational callout about captured JVM data.
  • Remove now-unused hidden highLevelIssue/subLevelIssue clipboard fields and tab navigation, while keeping hidden issue title/description.
  • Add new CSS for the RCA overview grid, cards, badges, tags, symptom/evidence items, and info box, including responsive behavior for smaller viewports.
Files:
  • src/main/resources/templates/analysisDetails.html
  • src/main/resources/META-INF/resources/css/dashboard.css

Change: Tighten examples and placeholder text in AI agents to avoid hard-coding specific container names.
Details:
  • Update EvidenceMatcherAgent example matchedLogs to use a placeholder container name instead of a concrete one.
  • Adjust RcaAssertionExtractor example root cause sentence to match the placeholder container naming convention.
Files:
  • src/main/java/com/causa/rca/ai/EvidenceMatcherAgent.java
  • src/main/java/com/causa/rca/ai/RcaAssertionExtractor.java
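To make the "defensive defaults" for the new report fields concrete, here is a minimal sketch of a value type holding them. It is an assumption for illustration only: the PR adds these as fields on `RcaReport`, and the record name `StructuredRcaFields`, the helper `copyOrEmpty`, and the empty-string and empty-list defaults are hypothetical. The `"Medium"` severity fallback follows the behavior described in the review comments on this PR.

```java
import java.util.List;

// Hypothetical value type; field names mirror the new RcaReport fields.
record StructuredRcaFields(
        String severity,
        String confidenceLevel,
        List<String> supportingEvidenceBullets,
        List<String> observableSymptoms,
        List<String> affectedServices) {

    // Compact constructor: normalise missing values so a template never sees null.
    StructuredRcaFields {
        severity = (severity == null || severity.isBlank()) ? "Medium" : severity;
        confidenceLevel = (confidenceLevel == null) ? "" : confidenceLevel;
        supportingEvidenceBullets = copyOrEmpty(supportingEvidenceBullets);
        observableSymptoms = copyOrEmpty(observableSymptoms);
        affectedServices = copyOrEmpty(affectedServices);
    }

    // Returns an immutable copy, or an empty list when the input is null.
    private static List<String> copyOrEmpty(List<String> in) {
        return in == null ? List.of() : List.copyOf(in);
    }

    public static void main(String[] args) {
        // Only affectedServices is supplied; everything else takes its default.
        StructuredRcaFields fields =
                new StructuredRcaFields(null, null, null, null, List.of("checkout"));
        System.out.println(fields);
    }
}
```

Normalising in one constructor means the UI template can iterate over the lists without null checks, whichever extractor produced them.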


sourcery-ai Bot left a comment

Hey - I've found 2 issues, and left some high-level feedback:

  • In RcaOrchestrator, the new extract* helpers all parse sections from rcaOutput with very similar regex boundaries. Consider centralizing the shared parsing logic, or at least the section delimiters, so that a future change to the LLM output format needs an update in only one place and the risk of subtle mismatches between helpers goes down.
  • The new RCA HTML layout drops the previous highLevelIssue and subLevelIssue tabbed views entirely. If those fields are still populated server-side, consider preserving them somewhere in the new UI (e.g., as expandable sections) so users do not lose potentially valuable context.
  • The additional logging in runAnalysisInternal builds its LOG.info messages by string concatenation. Switching to parameterized logging (e.g., LOG.info("Extracted {} supported logs from RCA Analyst", supportedLogs.size());) avoids constructing the string when INFO logging is disabled and keeps the log statements consistent with common logging practice.
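The first and third points above can be sketched together. This is a hypothetical helper, not code from the PR: the class name `RcaSectionParserSketch` and the method `parseSections` are invented, and the header names are the ones listed for the updated RootCauseAnalyst prompt in this guide. The project's logging framework is not shown in this PR page, so the sketch uses `java.util.logging` as a stand-in; its placeholder syntax is `{0}`, whereas the `{}` form in the comment above is the style used by SLF4J-like loggers.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: the only place that knows what a section header looks like.
final class RcaSectionParserSketch {
    private static final Logger LOG =
            Logger.getLogger(RcaSectionParserSketch.class.getName());

    // Header names as listed for the updated RootCauseAnalyst prompt.
    private static final List<String> HEADERS = List.of(
            "ROOT_CAUSE_TITLE", "SEVERITY", "ROOT_CAUSE", "SUPPORTING_EVIDENCE",
            "OBSERVABLE_SYMPTOMS", "AFFECTED_SERVICES", "KEY_EVIDENCE", "SUPPORTED_LOGS");

    // Matches "HEADER:" at the start of a line; "_" or " " between words, any case.
    private static final Pattern HEADER = Pattern.compile(
            "(?im)^(" + HEADERS.stream()
                    .map(h -> h.replace("_", "[_ ]"))
                    .collect(Collectors.joining("|"))
            + ")[ \\t]*:[ \\t]*");

    // Splits the output once; each section body runs until the next known header.
    static Map<String, String> parseSections(String rcaOutput) {
        Map<String, String> sections = new LinkedHashMap<>();
        Matcher m = HEADER.matcher(rcaOutput);
        String name = null;
        int bodyStart = 0;
        while (m.find()) {
            if (name != null) {
                sections.put(name, rcaOutput.substring(bodyStart, m.start()).trim());
            }
            name = m.group(1).toUpperCase(Locale.ROOT).replace(' ', '_');
            bodyStart = m.end();
        }
        if (name != null) {
            sections.put(name, rcaOutput.substring(bodyStart).trim());
        }
        // Parameterized logging: the message is only formatted if INFO is enabled.
        LOG.log(Level.INFO, "Parsed {0} sections from RCA Analyst output", sections.size());
        return sections;
    }

    public static void main(String[] args) {
        String sample = "SEVERITY: High\n"
                + "SUPPORTING_EVIDENCE:\n- heap at 98%\n- full GC every 2s\n"
                + "AFFECTED SERVICES:\n- checkout\n";
        parseSections(sample).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

With a splitter like this, each `extract*` helper would only need to post-process one map entry, so a change to the header format touches `HEADERS` and nothing else. One limitation to keep in mind: a log line that itself starts with a known header name followed by a colon would be treated as a new section.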

Comment on lines +526 to +527
Pattern pattern = Pattern.compile(
        "(?i)SUPPORTED[_ ]LOGS\\s*:\\s*(.*?)(?=\\n\\n|$)",

issue (bug_risk): SUPPORTED_LOGS regex can accidentally consume following sections and treat them as logs.

Because the pattern only stops at a blank line or end-of-string, a section header on the very next line (e.g. KEY_EVIDENCE:) will be captured into logsSection and may be treated as a supported log if it passes the length filter. Please tighten the lookahead to stop at section headers as well, e.g. (?=\n[A-Z_ ]+\s*:|\n\n|$), or at least include known headers like KEY[_ ]EVIDENCE in the lookahead.
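A small comparison may make the failure mode concrete. The `ORIGINAL` pattern below is the one quoted in this comment; `TIGHTENED` is one possible variant in the spirit of the suggestion, not the PR's eventual fix, and the class name and sample output are invented for illustration. Two details in the variant are assumptions worth noting: the case-insensitive flag is scoped to the header name so that `[A-Z_ ]` in the lookahead stays upper-case only, and `[ \t]*` replaces `\s*` after the colon so an empty section cannot swallow the next header.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison of the quoted pattern and a tightened variant.
final class SupportedLogsRegexSketch {
    // Pattern quoted in the review: stops only at a blank line or end of input.
    static final Pattern ORIGINAL = Pattern.compile(
            "(?i)SUPPORTED[_ ]LOGS\\s*:\\s*(.*?)(?=\\n\\n|$)",
            Pattern.DOTALL);

    // Variant: also stops before a line that looks like an upper-case section header.
    static final Pattern TIGHTENED = Pattern.compile(
            "(?i:SUPPORTED[_ ]LOGS)\\s*:[ \\t]*(.*?)(?=\\n[A-Z][A-Z_ ]*[ \\t]*:|\\n\\n|$)",
            Pattern.DOTALL);

    // Returns the trimmed section body, or an empty string when nothing matches.
    static String capture(Pattern p, String rcaOutput) {
        Matcher m = p.matcher(rcaOutput);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // No blank line between the log section and the next header.
        String out = "SUPPORTED_LOGS:\n"
                + "- java.lang.OutOfMemoryError: Java heap space\n"
                + "KEY_EVIDENCE: heap exhausted";
        System.out.println("original : " + capture(ORIGINAL, out).replace("\n", "\\n"));
        System.out.println("tightened: " + capture(TIGHTENED, out).replace("\n", "\\n"));
    }
}
```

With this sample, the original pattern's capture runs through the `KEY_EVIDENCE` line, while the tightened one stops after the log bullet. The variant still has a blind spot: a raw log line that begins with upper-case words followed by a colon would be mistaken for a header, which is one reason listing the known headers explicitly may be the safer choice.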

Comment on lines +532 to +541
if (matcher.find()) {
    String logsSection = matcher.group(1).trim();

    // Skip if it says "no direct supported logs present"
    if (logsSection.toLowerCase().contains("no direct supported logs present")) {
        LOG.info("RCA Analyst indicated no direct supported logs present");
        return logs;
    }

    // Split by newlines and filter out empty lines

issue (bug_risk): Severity extraction is very strict and may silently fall back to Medium for slightly noisy outputs.

The parser only accepts exact High|Medium|Low matches, so outputs like High - user impact or High. will fail the regex and silently fall back to "Medium". To reduce misclassification risk, normalize the extracted value first (e.g., take the first word and/or strip non-letter characters) before validating against the allowed severities.
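One way to apply the suggested normalization is sketched below, under stated assumptions: the class and method names are hypothetical, the `SEVERITY:` header comes from the prompt structure described in this PR, and returning an `Optional` instead of defaulting inside the parser is a design choice of this sketch rather than something the PR does.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical normaliser: tolerate noise around the severity word.
final class SeverityNormalizerSketch {
    // The rest of the line after "SEVERITY:", any case.
    private static final Pattern SEVERITY_LINE =
            Pattern.compile("(?im)^SEVERITY[ \\t]*:[ \\t]*(.*)$");
    // First run of letters, which skips punctuation or markdown such as "**".
    private static final Pattern FIRST_WORD = Pattern.compile("[A-Za-z]+");

    // Returns the canonical severity, or empty when the value is missing or unknown.
    static Optional<String> extractSeverity(String rcaOutput) {
        Matcher line = SEVERITY_LINE.matcher(rcaOutput);
        if (!line.find()) {
            return Optional.empty();
        }
        Matcher word = FIRST_WORD.matcher(line.group(1));
        if (!word.find()) {
            return Optional.empty();
        }
        switch (word.group().toLowerCase(Locale.ROOT)) {
            case "high":   return Optional.of("High");
            case "medium": return Optional.of("Medium");
            case "low":    return Optional.of("Low");
            default:       return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(extractSeverity("SEVERITY: High - user impact"));
        System.out.println(extractSeverity("SEVERITY: Critical"));
    }
}
```

A caller could then write `extractSeverity(rcaOutput).orElse("Medium")` and log a warning in the empty case, so the fallback to Medium is no longer silent.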
