Skip to content

feat: Add end-to-end integration tests for LogGuardian#189

Open
modupei wants to merge 6 commits intomainfrom
issues/11-e2e-test
Open

feat: Add end-to-end integration tests for LogGuardian#189
modupei wants to merge 6 commits intomainfrom
issues/11-e2e-test

Conversation

@modupei
Copy link
Copy Markdown

@modupei modupei commented Nov 27, 2025

Adds a new directory e2e_tests/ containing a Python script for end-to-end integration testing of LogGuardian. This script automates the deployment of LogGuardian to a real AWS environment, runs comprehensive test scenarios (non-compliant, compliant, and remediation of incorrect policies), and performs full cleanup of all created resources.

This addresses the need for validating LogGuardian's functionality in a live AWS environment.

Description

Brief description of changes

Checklist

  • Tests pass
  • Documentation updated
  • Breaking changes noted

Testing

How was this tested?

Summary by CodeRabbit

  • Tests
    • Added comprehensive multi-region end-to-end tests that deploy the solution, provision per-run resources to avoid collisions, execute multiple remediation and validation scenarios (including performance and invalid-payload cases), log progress with robust error handling, and perform automated cleanup of deployed resources.
  • Chores
    • Pinned Go toolchain metadata to a specific patch version for build consistency.

✏️ Tip: You can customize this high-level summary in your review settings.

Adds a new directory e2e_tests/ containing a Python script for end-to-end integration testing of LogGuardian. This script automates the deployment of LogGuardian to a real AWS environment, runs comprehensive test scenarios (non-compliant, compliant, and remediation of incorrect policies), and performs full cleanup of all created resources.

This addresses the need for validating LogGuardian's functionality in a live AWS environment.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Nov 27, 2025

Walkthrough

Adds a Python end-to-end test runner that packages and deploys the SAM template per region, provisions an S3 bucket, invokes the deployed Lambda across multiple CloudWatch Logs scenarios to verify retention/remediation, and performs region-scoped cleanup. (New file + small go.mod toolchain update.)

Changes

Cohort / File(s) Change Summary
E2E Integration Test Runner
e2e_tests/e2e_integration_test.py
New file. Introduces global constants (S3_BUCKET_BASE_NAME, DEFAULT_REGION, E2E_TEST_REGIONS, LOG_RETENTION_DAYS) and an E2ETestRunner class with run() and helpers to: validate SAM CLI, create/reuse S3 bucket, package & deploy via SAM, obtain deployed Lambda name from CloudFormation outputs, invoke the Lambda with multiple test scenarios (non‑compliant, compliant, existing policy remediation, large-scale performance, invalid payload, permissions placeholder), poll CloudWatch Logs retention until expected, and clean up log groups, CloudFormation stack, and S3 artifacts with per-run unique naming and robust error handling.
Go toolchain metadata
go.mod
Minor toolchain metadata update: go directive set to 1.24.0 and added toolchain go1.24.5. No dependency or API changes.

Sequence Diagram

sequenceDiagram
    participant Main as Main (multi-region)
    participant Runner as E2ETestRunner
    participant SAM as SAM CLI
    participant S3 as S3
    participant CF as CloudFormation
    participant Lambda as Deployed Lambda
    participant Logs as CloudWatch Logs

    Main->>Runner: instantiate(region, run_id) & run()
    Runner->>SAM: verify SAM CLI present
    Runner->>S3: create or reuse bucket (region + run_id)
    Runner->>SAM: sam package (upload artifacts to S3)
    Runner->>CF: sam deploy (create/update stack)
    CF-->>Runner: stack outputs (Lambda name)
    Runner->>Logs: create test log groups
    Runner->>Lambda: invoke test payloads
    Lambda->>Logs: apply retention/policy changes
    Runner->>Logs: poll & verify retention settings
    alt success
        Runner->>CF: delete stack
        Runner->>S3: delete bucket & artifacts
        Runner->>Logs: delete test log groups
    else failure
        Runner->>CF: best-effort cleanup
        Runner->>S3: best-effort cleanup
        Runner->>Logs: best-effort cleanup
    end
    Runner-->>Main: return status
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Inspect SAM subprocess invocation, CLI arguments, and error handling in packaging/deploy functions.
  • Verify CloudFormation outputs parsing and waiter/timeouts used to retrieve Lambda name.
  • Review log retention polling logic, timeouts, and potential race conditions.
  • Check S3 bucket creation/ownership handling and cleanup idempotence for buckets/objects.
  • Validate per-region/per-run naming to avoid collisions and any hardcoded assumptions.

Poem

🐰 I hopped a bucket up to the blue,
Packaged a stack and watched it brew.
I nudged the logs to mind their day,
Cleaned the stacks and bounded away.
A rabbit's run — deployed, checked, and true.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately summarizes the main change: adding end-to-end integration tests for LogGuardian, which is the primary focus of the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 94.44% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch issues/11-e2e-test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
e2e_tests/e2e_integration_test.py (4)

88-103: Prefer logging.exception for better diagnostics.

The S3 bucket creation logic correctly handles the us-east-1 special case. However, line 102 should use logging.exception instead of logging.error to automatically include the stack trace in the logs.

Apply this diff:

         except self.s3_client.exceptions.ClientError as e:
-            logging.error(f"Failed to create S3 bucket: {e}")
+            logging.exception("Failed to create S3 bucket")
             raise

139-150: Add defensive check and improve error logging.

The method should handle the case where the stack might not have any outputs, and line 149 should use logging.exception for better diagnostics.

Apply this diff:

     def _get_lambda_function_name(self):
         """Retrieves the physical name of the deployed Lambda function from stack outputs."""
         try:
             response = self.cf_client.describe_stacks(StackName=self.stack_name)
-            outputs = response["Stacks"][0]["Outputs"]
+            outputs = response["Stacks"][0].get("Outputs", [])
             for output in outputs:
                 if output["OutputKey"] == "LogGuardianFunction":
                     return output["OutputValue"]
             raise Exception("Could not find Lambda function name in stack outputs.")
         except self.cf_client.exceptions.ClientError as e:
-            logging.error(f"Failed to describe stack '{self.stack_name}': {e}")
+            logging.exception(f"Failed to describe stack '{self.stack_name}'")
             raise

174-218: Remove unnecessary f-string prefixes.

Lines 191 and 207 have f-string prefixes but no placeholders. This is a minor style issue.

Apply this diff:

-        logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
+        logging.info("\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")

-        logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")
+        logging.info("\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")

242-285: Use error code instead of message string for more reliable error checking.

Line 269 checks for the substring "does not exist" in the error message, which is fragile and could break if AWS changes the error message wording.

Apply this diff:

         except self.cf_client.exceptions.ClientError as e:
-            if "does not exist" in e.response['Error']['Message']:
+            if e.response['Error']['Code'] == 'ValidationError':
                 logging.info("Stack already deleted.")
             else:
                 logging.warning(f"Could not delete stack '{self.stack_name}': {e}")

Note on broad exception handling in cleanup.

Lines 259 and 284 catch broad Exception types. While normally discouraged, this is acceptable in cleanup code where the goal is to attempt deletion of all resources even if some fail. The warnings are logged, guiding manual cleanup if needed.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b647d8d and 41fb473.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
e2e_tests/e2e_integration_test.py

39-39: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


83-83: Starting a process with a shell: seems safe, but may be changed in the future; consider rewriting without shell

(S605)


83-83: Starting a process with a partial executable path

(S607)


83-83: Starting a process with a shell: seems safe, but may be changed in the future; consider rewriting without shell

(S605)


83-83: Starting a process with a partial executable path

(S607)


85-85: Avoid specifying long messages outside the exception class

(TRY003)


102-102: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


115-115: Starting a process with a shell, possible injection detected

(S605)


116-116: Avoid specifying long messages outside the exception class

(TRY003)


127-127: Starting a process with a shell, possible injection detected

(S605)


128-128: Avoid specifying long messages outside the exception class

(TRY003)


147-147: Create your own exception

(TRY002)


147-147: Avoid specifying long messages outside the exception class

(TRY003)


149-149: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


191-191: f-string without any placeholders

Remove extraneous f prefix

(F541)


207-207: f-string without any placeholders

Remove extraneous f prefix

(F541)


238-238: Consider moving this statement to an else block

(TRY300)


259-259: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Container Binary
  • GitHub Check: Build AWS Lambda Function
  • GitHub Check: Build and Test Docker Container
🔇 Additional comments (5)
e2e_tests/e2e_integration_test.py (5)

33-54: LGTM! Good resource isolation pattern.

The use of a random run_id to create unique resource names for each test run is a solid approach to prevent collisions during parallel test execution or reruns.

Note: The static analysis warning about random.choices (line 39) not being cryptographically secure is a false positive in this context, as the run_id is only used for resource naming, not security purposes.


56-78: LGTM! Solid test lifecycle orchestration.

The try-except-finally structure ensures cleanup runs even on failure, and the non-zero exit code properly signals test failures to CI/CD systems.


80-86: Cross-platform check works correctly.

The dual check for both Windows (nul) and Unix (/dev/null) ensures the SAM CLI detection works across platforms. The static analysis warnings about shell execution are false positives here since the command is hardcoded.


230-240: LGTM! Correct log group retrieval logic.

The method properly uses prefix filtering followed by exact name matching to handle the CloudWatch Logs API behavior.


287-297: LGTM! Clean entry point.

The main block properly instantiates the test runner with configuration from module constants and executes the test suite.

Comment thread e2e_tests/e2e_integration_test.py Outdated
Comment thread e2e_tests/e2e_integration_test.py Outdated
Comment thread e2e_tests/e2e_integration_test.py Outdated
Applies fixes from Code Rabbit review:
- Replaces os.system with subprocess.run for safer command execution.
- Implements polling logic for Lambda remediation instead of fixed sleep.
- Adds defensive handling for existing log groups by deleting and recreating them to ensure a clean test state.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
e2e_tests/e2e_integration_test.py (1)

140-155: CloudFormation stack waiter logic is still brittle for existing/updated stacks; prefer explicit status verification.

Using the stack_create_complete waiter here remains problematic for stacks that already exist and have gone through updates, and the ClientError-based branching won’t catch the typical WaiterError you get when the waiter encounters an unexpected StackStatus. This can cause spurious test failures when re-running against an existing stack or after template changes, even though sam deploy already waits for CloudFormation itself. (docs.aws.amazon.com)

You can simplify and harden this by dropping the create/update waiters entirely and just verifying the final stack status with a single describe_stacks call after a successful sam deploy:

-        logging.info("Waiting for stack deployment to complete...")
-        try:
-            waiter = self.cf_client.get_waiter('stack_create_complete')
-            waiter.wait(StackName=self.stack_name, WaiterConfig={'Delay': 15, 'MaxAttempts': 40})
-        except self.cf_client.exceptions.ClientError as e:
-            error_code = e.response.get("Error", {}).get("Code")
-            error_message = e.response.get("Error", {}).get("Message")
-            if error_code == "AlreadyExistsException" or "already exists" in error_message.lower():
-                logging.info("Stack already exists, waiting for update to complete...")
-                waiter = self.cf_client.get_waiter('stack_update_complete')
-                waiter.wait(StackName=self.stack_name, WaiterConfig={'Delay': 15, 'MaxAttempts': 40})
-            elif error_code == "ValidationError" and ("No updates are to be performed" in error_message or "ROLLBACK_COMPLETE" in error_message):
-                logging.warning(f"Stack '{self.stack_name}' is in a terminal state without active update: {error_message}. Proceeding.")
-            else:
-                raise
-        logging.info(f"Stack '{self.stack_name}' deployed/updated successfully.")
+        logging.info("Verifying CloudFormation stack status after deployment...")
+        try:
+            response = self.cf_client.describe_stacks(StackName=self.stack_name)
+        except self.cf_client.exceptions.ClientError as e:
+            logging.exception(f"Failed to describe stack '{self.stack_name}' after deployment")
+            raise
+
+        status = response["Stacks"][0]["StackStatus"]
+        if status not in ("CREATE_COMPLETE", "UPDATE_COMPLETE"):
+            raise RuntimeError(f"Unexpected stack status '{status}' after 'sam deploy'")
+
+        logging.info(f"Stack '{self.stack_name}' is in a healthy state: {status}.")

This makes re-runs against existing stacks reliable and keeps the logic aligned with what sam deploy already does.

🧹 Nitpick comments (1)
e2e_tests/e2e_integration_test.py (1)

223-247: Drop f prefixes from logging strings without interpolation.

A few logging.info calls are f-strings without any {} placeholders; they work but trigger Ruff’s F541 and add noise:

  • Line 223: logging.info(f"PASS: Log group was successfully remediated.")
  • Line 227: logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
  • Line 247: logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")

You can simplify them as plain strings:

-        logging.info(f"PASS: Log group was successfully remediated.")
+        logging.info("PASS: Log group was successfully remediated.")
...
-        logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
+        logging.info("\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
...
-        logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")
+        logging.info("\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41fb473 and f9ac3a9.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
e2e_tests/e2e_integration_test.py

40-40: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


86-86: Starting a process with a partial executable path

(S607)


89-89: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


90-90: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


90-90: Avoid specifying long messages outside the exception class

(TRY003)


106-106: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


115-115: subprocess call: check for execution of untrusted input

(S603)


115-120: Starting a process with a partial executable path

(S607)


122-122: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


123-123: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


123-123: Avoid specifying long messages outside the exception class

(TRY003)


127-127: subprocess call: check for execution of untrusted input

(S603)


127-135: Starting a process with a partial executable path

(S607)


137-137: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


138-138: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


138-138: Avoid specifying long messages outside the exception class

(TRY003)


168-168: Create your own exception

(TRY002)


168-168: Avoid specifying long messages outside the exception class

(TRY003)


170-170: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


223-223: f-string without any placeholders

Remove extraneous f prefix

(F541)


227-227: f-string without any placeholders

Remove extraneous f prefix

(F541)


247-247: f-string without any placeholders

Remove extraneous f prefix

(F541)


283-283: Consider moving this statement to an else block

(TRY300)


304-304: Do not catch blind exception: Exception

(BLE001)


329-329: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test Docker Container
🔇 Additional comments (1)
e2e_tests/e2e_integration_test.py (1)

28-80: Overall E2E runner structure is solid.

Nice job encapsulating the full lifecycle (SAM check, S3 bucket, deploy, three scenarios, cleanup) inside E2ETestRunner.run() with a try/except/finally that guarantees cleanup and surfaces failures via a non‑zero exit. The randomized run_id-based naming also avoids resource collisions between runs.

This commit introduces a new, easier-to-use configuration management system for LogGuardian, addressing several acceptance criteria:

- **YAML-based configuration files:** The application now uses 'config.yaml' as its primary configuration source, replacing environment variables for core settings.
- **Configuration validation and error reporting:** Integrated 'go-playground/validator/v10' for robust validation of 'config.yaml', providing clear error messages for misconfigurations.
- **Environment-specific configuration templates:** 'config.<env>.yaml' files are supported, allowing for environment-specific overrides based on the 'ENVIRONMENT' variable.
- **Configuration migration tools for updates:** A core 'Migrate' function is implemented to convert old 'env.json' settings to the new 'config.yaml' format.
- **Configuration backup and restore capabilities:** Demonstrated functionality for backing up and restoring 'config.yaml' files.

The 'Visual configuration validation tools' criterion remains out of scope for this implementation.
Copy link
Copy Markdown

@godsonten godsonten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the AC:

  • Multi-region testing capabilities

    • Partially Met: The script can be run in different regions via the AWS_REGION environment variable, but it doesn't orchestrate tests across multiple regions in a single run. This is a
      good first step.
    • Performance testing with large log group counts
    • Not Met: The tests only create a few log groups and lack a test case for performance under a high volume of resources.
  • Error scenario testing and recovery validation

    • Not Met: The tests focus on successful remediation paths and do not cover failure modes like invalid payloads or permission errors.

Areas for Improvement:

  1. Performance Testing: A new test case could be added to create a large number of log groups and invoke the Lambda on all of them (or a subset) to see how it performs. This might require
    a different invocation strategy than one-by-one.
  2. Error Scenario Testing: Additional test cases should be added to simulate failure modes. This would require intentionally setting up scenarios with incorrect permissions or malformed
    events to validate the application's error handling and recovery.
  3. Hardcoded Values: The LOG_RETENTION_DAYS is hardcoded. While it's a constant, it might be better to fetch this from the template.yaml's default parameter value to make the test more
    resilient to changes in the SAM template. This is a minor point, however.
  4. Security: The use of subprocess.run is much better than os.system or subprocess.call with shell=True. The implementation is safe.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (5)
e2e_tests/e2e_integration_test.py (5)

28-40: Good defensive approach to configuration loading.

The helper function appropriately handles failures when reading the template and falls back to a sensible default. This ensures the test script is resilient.

Optional: Consider using logging.exception instead of logging.error on line 35 to automatically include the traceback, which can aid debugging if the template format changes.


267-267: Remove unnecessary f-string prefix.

Line 267 uses an f-string without any placeholders. This is harmless but unnecessary.

-        logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
+        logging.info("\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")

288-288: Remove unnecessary f-string prefix.

Same as the previous comment - line 288 has an f-string without placeholders.

-        logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")
+        logging.info("\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")

395-442: Comprehensive cleanup with appropriate error handling.

The cleanup method properly handles all resource types (log groups, CloudFormation stack, S3 bucket) and gracefully handles errors without failing the test, which is the correct behavior for cleanup.

Optional: Line 400-404 could use iterable unpacking for slightly cleaner code:

-        log_groups_to_delete = [
-            self.log_group_name_non_compliant,
-            self.log_group_name_compliant,
-            self.log_group_name_remediate,
-        ] + self.performance_log_groups # Add performance log groups to cleanup
+        log_groups_to_delete = [
+            self.log_group_name_non_compliant,
+            self.log_group_name_compliant,
+            self.log_group_name_remediate,
+            *self.performance_log_groups # Add performance log groups to cleanup
+        ]

444-461: Consider adding AWS credentials validation at startup.

The script will fail with cryptic boto3 errors if AWS credentials are not configured. Adding an upfront check would provide clearer feedback to users.

Add this check before the main loop:

if __name__ == "__main__":
    # Validate AWS credentials are configured
    try:
        sts = boto3.client('sts')
        sts.get_caller_identity()
        logging.info("AWS credentials validated successfully.")
    except Exception as e:
        logging.error(f"AWS credentials not configured or invalid: {e}")
        logging.error("Please configure AWS credentials using 'aws configure' or environment variables.")
        sys.exit(1)
    
    # Run tests for each specified region
    for region in E2E_TEST_REGIONS:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dfbf6e and 74bf115.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
e2e_tests/e2e_integration_test.py

34-34: Do not catch blind exception: Exception

(BLE001)


35-35: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


57-57: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


108-108: Starting a process with a partial executable path

(S607)


111-111: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


112-112: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


112-112: Avoid specifying long messages outside the exception class

(TRY003)


131-131: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


140-140: subprocess call: check for execution of untrusted input

(S603)


140-146: Starting a process with a partial executable path

(S607)


148-148: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


149-149: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


149-149: Avoid specifying long messages outside the exception class

(TRY003)


153-153: subprocess call: check for execution of untrusted input

(S603)


153-161: Starting a process with a partial executable path

(S607)


163-163: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


164-164: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


164-164: Avoid specifying long messages outside the exception class

(TRY003)


194-194: Create your own exception

(TRY002)


194-194: Avoid specifying long messages outside the exception class

(TRY003)


196-196: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


267-267: f-string without any placeholders

Remove extraneous f prefix

(F541)


288-288: f-string without any placeholders

Remove extraneous f prefix

(F541)


380-380: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


391-391: Consider moving this statement to an else block

(TRY300)


400-404: Consider iterable unpacking instead of concatenation

(RUF005)


412-412: Do not catch blind exception: Exception

(BLE001)


426-426: Do not catch blind exception: Exception

(BLE001)


441-441: Do not catch blind exception: Exception

(BLE001)


455-455: Do not catch blind exception: Exception

(BLE001)


456-456: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Container Binary
  • GitHub Check: Build and Test Docker Container
  • GitHub Check: Build AWS Lambda Function
🔇 Additional comments (3)
e2e_tests/e2e_integration_test.py (3)

51-74: LGTM! Excellent resource naming strategy.

The initialization properly generates unique resource names using run_id and region, which prevents collisions during concurrent or repeated test runs. The use of random.choices for test resource naming is appropriate (despite static analysis warnings about cryptographic security, which don't apply here).


231-246: Excellent polling implementation.

This properly replaces the fixed sleep approach with configurable polling, making tests more reliable and faster. The timeout logic and final verification are well-implemented.


364-381: Excellent defensive handling of existing resources.

The helper properly handles the case where log groups already exist by deleting and recreating them, ensuring a clean test state. This addresses potential issues from interrupted test runs.

Comment thread e2e_tests/e2e_integration_test.py Outdated
Comment thread e2e_tests/e2e_integration_test.py
Comment thread e2e_tests/e2e_integration_test.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
e2e_tests/e2e_integration_test.py (5)

29-38: Improve exception handling and document fallback value.

The broad Exception catch and generic error logging make debugging harder. The fallback value 14 is also undocumented.

Consider this improvement:

 def get_default_retention_days_from_template(template_path="template.yaml"):
+    """Reads DefaultRetentionDays from template.yaml, falling back to 14 days on error."""
     try:
         with open(template_path, 'r') as f:
             template = yaml.safe_load(f)
         # Assuming DefaultRetentionDays is directly under Parameters
         return template['Parameters']['DefaultRetentionDays']['Default']
-    except Exception as e:
-        logging.error(f"Could not read DefaultRetentionDays from {template_path}: {e}")
+    except (FileNotFoundError, KeyError, yaml.YAMLError) as e:
+        logging.exception(f"Could not read DefaultRetentionDays from {template_path}, using fallback of 14 days")
         # Fallback to a default if reading fails
         return 14

313-344: Consider the polling timeout and sequencing in performance test.

The performance test creates 50 log groups and polls each sequentially with a 180-second timeout (line 336). If remediation happens sequentially or experiences delays, the test might timeout prematurely. Additionally, the test breaks on the first failure (line 338), which prevents gathering data on all log groups.

Consider these improvements:

         # Poll for remediation
         all_remediated = True
+        failed_log_groups = []
         for lg_name in self.performance_log_groups:
-            if not self._poll_for_retention_policy(lg_name, self.retention_days, timeout=180, interval=10): # Increased timeout for bulk
+            if not self._poll_for_retention_policy(lg_name, self.retention_days, timeout=300, interval=10): # 5 minutes per log group
                 all_remediated = False
-                break
+                failed_log_groups.append(lg_name)
+                # Continue polling others to gather full results
         
         end_time = time.time()
         total_time = end_time - start_time
         
-        assert all_remediated, f"FAIL: Not all performance log groups were remediated in {self.region}."
+        assert all_remediated, f"FAIL: {len(failed_log_groups)} out of {count} performance log groups failed remediation in {self.region}. Failed: {failed_log_groups}"
         logging.info(f"PASS: Performance test with {count} log groups completed in {self.region}. Total time: {total_time:.2f} seconds.")

389-391: Use logging.exception for better error diagnostics.

When re-raising exceptions, logging.exception automatically includes the stack trace, which aids debugging.

Apply this diff:

         except Exception as e:
-            logging.error(f"Failed to create log group '{name}' in {self.region}: {e}")
+            logging.exception(f"Failed to create log group '{name}' in {self.region}")
             raise

440-452: Handle non-existent S3 bucket gracefully during cleanup.

If the S3 bucket was never created (e.g., test failed during setup), the cleanup will attempt to empty and delete a non-existent bucket, which might raise NoSuchBucket errors. While this is caught by the broad Exception handler, explicitly checking for bucket existence improves clarity.

Apply this diff:

         try:
             logging.info(f"Deleting S3 bucket: {self.s3_bucket_name} in {self.region}")
+            # Check if bucket exists first
+            try:
+                self.s3_client.head_bucket(Bucket=self.s3_bucket_name)
+            except self.s3_client.exceptions.NoSuchBucket:
+                logging.info(f"S3 bucket '{self.s3_bucket_name}' does not exist in {self.region}. Skipping deletion.")
+                return
+            
             # Empty the bucket before deletion
             paginator = self.s3_client.get_paginator('list_object_versions')
             for page in paginator.paginate(Bucket=self.s3_bucket_name):

465-467: Use logging.exception for comprehensive error reporting.

The main execution block catches all exceptions but only logs the error message. Using logging.exception would include the full stack trace, making debugging easier.

Apply this diff:

         except Exception as e:
-            logging.error(f"❌ Test failed in region {region}: {e}")
+            logging.exception(f"❌ Test failed in region {region}")
             sys.exit(1) # Exit with an error code if any region fails
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74bf115 and f70daca.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
e2e_tests/e2e_integration_test.py

35-35: Do not catch blind exception: Exception

(BLE001)


36-36: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


58-58: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


109-109: Starting a process with a partial executable path

(S607)


112-112: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


132-132: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


141-141: subprocess call: check for execution of untrusted input

(S603)


141-147: Starting a process with a partial executable path

(S607)


149-149: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


150-150: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


150-150: Avoid specifying long messages outside the exception class

(TRY003)


154-154: subprocess call: check for execution of untrusted input

(S603)


154-162: Starting a process with a partial executable path

(S607)


164-164: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


165-165: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


176-176: Avoid specifying long messages outside the exception class

(TRY003)


185-185: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


186-186: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


186-186: Avoid specifying long messages outside the exception class

(TRY003)


199-199: Create your own exception

(TRY002)


199-199: Avoid specifying long messages outside the exception class

(TRY003)


201-201: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


231-231: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


277-277: f-string without any placeholders

Remove extraneous f prefix

(F541)


298-298: f-string without any placeholders

Remove extraneous f prefix

(F541)


390-390: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


401-401: Consider moving this statement to an else block

(TRY300)


410-414: Consider iterable unpacking instead of concatenation

(RUF005)


422-422: Do not catch blind exception: Exception

(BLE001)


436-436: Do not catch blind exception: Exception

(BLE001)


451-451: Do not catch blind exception: Exception

(BLE001)


465-465: Do not catch blind exception: Exception

(BLE001)


466-466: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test Docker Container

Comment thread e2e_tests/e2e_integration_test.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
e2e_tests/e2e_integration_test.py (3)

29-38: Ensure type consistency for retention days value.

The return value from template['Parameters']['DefaultRetentionDays']['Default'] may be a string or integer depending on the YAML structure. If the template contains Default: "14" (string) instead of Default: 14 (integer), this could cause issues later when the value is used in API calls that expect an integer.

Apply this diff to ensure consistent integer type:

 def get_default_retention_days_from_template(template_path="template.yaml"):
     try:
         with open(template_path, 'r') as f:
             template = yaml.safe_load(f)
         # Assuming DefaultRetentionDays is directly under Parameters
-        return template['Parameters']['DefaultRetentionDays']['Default']
+        return int(template['Parameters']['DefaultRetentionDays']['Default'])
     except Exception as e:
         logging.error(f"Could not read DefaultRetentionDays from {template_path}: {e}")
         # Fallback to a default if reading fails
         return 14

284-284: Remove unnecessary f-string prefix.

These strings don't contain any placeholders, so the f prefix is unnecessary.

Apply this diff:

-        logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
+        logging.info("\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
-        logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")
+        logging.info("\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")

Also applies to: 305-305


417-421: Consider using list unpacking for better readability.

The current list concatenation works but can be simplified.

Apply this diff to use iterable unpacking:

         # List of log groups to delete
-        log_groups_to_delete = [
+        log_groups_to_delete = [
+            *self.performance_log_groups,
             self.log_group_name_non_compliant,
             self.log_group_name_compliant,
             self.log_group_name_remediate,
-        ] + self.performance_log_groups # Add performance log groups to cleanup
+        ]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f70daca and 1dff4db.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
e2e_tests/e2e_integration_test.py

35-35: Do not catch blind exception: Exception

(BLE001)


36-36: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


58-58: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


109-109: Starting a process with a partial executable path

(S607)


112-112: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


132-132: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


141-141: subprocess call: check for execution of untrusted input

(S603)


141-147: Starting a process with a partial executable path

(S607)


149-149: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


150-150: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


150-150: Avoid specifying long messages outside the exception class

(TRY003)


154-154: subprocess call: check for execution of untrusted input

(S603)


154-162: Starting a process with a partial executable path

(S607)


164-164: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


165-165: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


176-176: Avoid specifying long messages outside the exception class

(TRY003)


193-193: Avoid specifying long messages outside the exception class

(TRY003)


206-206: Create your own exception

(TRY002)


206-206: Avoid specifying long messages outside the exception class

(TRY003)


208-208: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


238-238: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


284-284: f-string without any placeholders

Remove extraneous f prefix

(F541)


305-305: f-string without any placeholders

Remove extraneous f prefix

(F541)


397-397: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


408-408: Consider moving this statement to an else block

(TRY300)


417-421: Consider iterable unpacking instead of concatenation

(RUF005)


429-429: Do not catch blind exception: Exception

(BLE001)


443-443: Do not catch blind exception: Exception

(BLE001)


458-458: Do not catch blind exception: Exception

(BLE001)


472-472: Do not catch blind exception: Exception

(BLE001)


473-473: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test Docker Container
🔇 Additional comments (2)
e2e_tests/e2e_integration_test.py (2)

77-103: Excellent test structure and lifecycle management!

The test orchestration is well-designed with clear phases (setup, test execution, cleanup), proper error handling with detailed logging, and guaranteed cleanup via the finally block. The multi-region support with resource isolation is a nice touch.


461-478: Good multi-region orchestration with proper error handling.

The main execution properly iterates through regions, handles failures with appropriate exit codes, and provides clear success/failure messaging. The .strip() on line 467 is good defensive programming for parsing environment variables.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (8)
e2e_tests/e2e_integration_test.py (8)

28-41: Make template.yaml lookup robust to working directory changes.

get_default_retention_days_from_template assumes template.yaml is in the current working directory. If someone runs this script from another folder (e.g., via python e2e_tests/e2e_integration_test.py from a different CWD), it will silently fall back to 14 rather than using the real template value.

Consider resolving the path relative to this file instead:

-import os
+import os
+from pathlib import Path
...
-def get_default_retention_days_from_template(template_path="template.yaml"):
+def get_default_retention_days_from_template(
+    template_path: str | None = None,
+):
     try:
-        with open(template_path, 'r') as f:
+        if template_path is None:
+            template_path = Path(__file__).resolve().parent.parent / "template.yaml"
+        with open(template_path, 'r') as f:
             template = yaml.safe_load(f)

This keeps the behavior deterministic regardless of where the script is launched from.


35-38: Clean up unused exception variables and consider preserving original tracebacks.

Several except ... as e: blocks only call logging.exception(...) and then re‑raise (or raise a new exception) without using e, which triggers Ruff’s F841 and slightly obscures intent:

  • Lines 35–38, 98–100, 207–209, 396–398, 457–458, 471–472.

You can either:

  • Drop the unused name:
-    except Exception as e:
+    except Exception:
         logging.exception("...")
         raise

or, where you raise a new exception, explicitly chain from the original:

-    except FileNotFoundError as e:
-        logging.exception("SAM CLI is not installed ...")
-        raise EnvironmentError("SAM CLI not found.")
+    except FileNotFoundError as e:
+        logging.exception("SAM CLI is not installed ...")
+        raise EnvironmentError("SAM CLI not found.") from e

This both satisfies Ruff and keeps exception causality clearer during debugging.

Also applies to: 98-100, 207-209, 396-398, 457-458, 471-472


52-76: Run‑ID randomness is fine here; optionally silence S311 if desired.

Ruff’s S311 warning on Line 58 (random.choices) isn’t a real problem in this context—the run ID only needs to avoid S3/stack name collisions, not provide cryptographic security. If you want to keep the linter quiet without changing behavior, you can add a targeted noqa:

-        self.run_id = ''.join(random.choices(string.ascii_lowercase + string.digits, k=8))
+        self.run_id = ''.join(  # noqa: S311 - non‑cryptographic randomness is fine for test IDs
+            random.choices(string.ascii_lowercase + string.digits, k=8)
+        )

Or, if you prefer, switch to secrets.token_hex(4) for a slightly simpler, crypto‑safe ID.


77-104: Be aware that cleanup exceptions can mask the original test failure.

run() re‑raises any test failure (asserts, AWS errors, etc.) and then always calls _cleanup() in finally. If _cleanup() raises (e.g., stack deletion waiter fails), that exception will replace the original one, so CI will report a cleanup failure instead of the root cause.

If you’d rather keep the original error while still attempting cleanup, you can capture and log cleanup failures without letting them overwrite the main exception:

-        except Exception as e:
-            logging.exception("❌ An error occurred during the integration test in %s.", self.region)
-            raise
-        finally:
-            # --- Cleanup Phase ---
-            self._cleanup()
+        except Exception:
+            logging.exception("❌ An error occurred during the integration test in %s.", self.region)
+            raise
+        finally:
+            # --- Cleanup Phase ---
+            try:
+                self._cleanup()
+            except Exception:
+                logging.exception("⚠️ Cleanup failed; AWS resources may remain in %s.", self.region)
+                # Let the original test failure (if any) propagate

This keeps cleanup issues visible in logs without hiding the primary failure.


115-134: Use structured error codes instead of string matching for S3 bucket errors.

In _setup_s3_bucket, the ClientError handling checks if "BucketAlreadyOwnedByYou" in str(e):, which is brittle and depends on error message text.

It’s safer to branch on the AWS error code:

-        except ClientError as e:
-            if "BucketAlreadyOwnedByYou" in str(e):
+        except ClientError as e:
+            error_code = e.response.get("Error", {}).get("Code")
+            if error_code == "BucketAlreadyOwnedByYou":
                 logging.warning(f"S3 bucket '{self.s3_bucket_name}' already exists and owned by you. Proceeding.")
             else:
                 logging.exception(
                     f"Failed to create S3 bucket '{self.s3_bucket_name}' in {self.region}."
                 )
                 raise

This will be more robust if AWS ever changes the error message wording.


248-263: Performance test may be slow; consider making it configurable.

_test_performance_large_log_groups creates count log groups (default 50) and, for each, runs _poll_for_retention_policy with timeout=180. In the worst case (first group never remediated), this can add several minutes to a single test run.

If these e2e tests are intended to run regularly in CI, you might want to:

  • Make count, timeout, and interval overridable via environment variables, or
  • Gate the performance test behind a flag (e.g., E2E_ENABLE_PERF_TEST) so you can skip or downsize it in standard pipelines.

That keeps the test useful for load validation without disproportionately lengthening normal runs.

Also applies to: 320-352


234-240: Use logging.exception in the JSON parsing error path for better debugging.

In the JSONDecodeError branch you currently log with logging.error, which loses the stack trace and raw payload:

except json.JSONDecodeError as e:
    logging.error(f"Failed to parse Lambda response payload: {e}")
    return False, {"error": "Invalid JSON response from Lambda"}

Switching to logging.exception preserves full context:

-        except json.JSONDecodeError as e:
-            logging.error(f"Failed to parse Lambda response payload: {e}")
+        except json.JSONDecodeError:
+            logging.exception("Failed to parse Lambda response payload.")
             return False, {"error": "Invalid JSON response from Lambda"}

This aligns with the rest of the file’s error‑logging style and addresses Ruff’s TRY400 suggestion.


412-459: Cleanup flow is thorough; consider also catching WaiterError for stack deletion.

The cleanup logic for log groups, stack, and S3 bucket is comprehensive and well‑logged. One minor gap: for stack deletion you only catch ClientError, but the stack_delete_complete waiter can raise WaiterError if the stack enters a terminal failure state.

If you want more consistent error reporting (while still failing the test), you can extend the exception list:

-from botocore.exceptions import WaiterError, ClientError
+from botocore.exceptions import WaiterError, ClientError
...
-        except ClientError as e:
+        except (ClientError, WaiterError) as e:
             if isinstance(e, ClientError) and "does not exist" in e.response['Error']['Message']:
                 logging.info(f"Stack in {self.region} already deleted.")
             else:
-                logging.exception(f"Could not delete stack '{self.stack_name}' in {self.region}.")
+                logging.exception(f"Could not delete stack '{self.stack_name}' in {self.region}.")

This way, failures during deletion are logged with full context whether they originate from the API call or the waiter.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1dff4db and c8db2e7.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
e2e_tests/e2e_integration_test.py

35-35: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


58-58: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


98-98: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


109-109: Starting a process with a partial executable path

(S607)


113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


141-141: subprocess call: check for execution of untrusted input

(S603)


141-147: Starting a process with a partial executable path

(S607)


150-150: Avoid specifying long messages outside the exception class

(TRY003)


154-154: subprocess call: check for execution of untrusted input

(S603)


154-162: Starting a process with a partial executable path

(S607)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


176-176: Avoid specifying long messages outside the exception class

(TRY003)


193-193: Avoid specifying long messages outside the exception class

(TRY003)


206-206: Create your own exception

(TRY002)


206-206: Avoid specifying long messages outside the exception class

(TRY003)


207-207: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


238-238: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


396-396: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


408-408: Consider moving this statement to an else block

(TRY300)


457-457: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


471-471: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build AWS Lambda Function
  • GitHub Check: Build Container Binary
  • GitHub Check: Build and Test Docker Container
🔇 Additional comments (4)
e2e_tests/e2e_integration_test.py (4)

135-197: CloudFormation deploy + waiter logic looks solid.

Nice handling of the SAM package/deploy flow and CFN waiters:

  • Differentiating CREATE_IN_PROGRESS vs UPDATE_IN_PROGRESS and routing to the appropriate waiter.
  • Treating CREATE_COMPLETE / UPDATE_COMPLETE as a no‑op deployment (empty changeset).
  • Explicitly failing fast on any ROLLBACK* status and catching WaiterError to surface deploy issues.

This should make deployments deterministic across both first‑time and incremental runs.


381-399: _create_log_group behavior looks robust for idempotent test runs.

The helper cleanly handles existing log groups by deleting and recreating them, then re‑applying the retention policy. This guarantees a known state across reruns and nicely complements the performance tests and remediation checks.

No changes needed here.


460-477: Main loop behavior is clear and failure‑friendly.

The per‑region loop, combined with sys.exit(1) on the first failure, gives a clear “all‑or‑nothing” signal to CI while still ensuring cleanup is attempted inside each E2ETestRunner. Logging with region context is also helpful for diagnosing which region failed.

Nothing to change here.


211-247: I'll verify the review comment by examining the test code and the Lambda implementation to understand the current error handling behavior.
<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

First, let me find the e2e_integration_test.py file

find . -name "e2e_integration_test.py" -type f
</invoke-function_body>
</invoke-function_calls>
</function_calls>

Let me search for the test file and examine its structure:
<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Get file listing to understand repo structure

git ls-files | grep -E "(test|lambda)" | head -20

</invoke-function_calls>
</function_calls>

Now let me examine the specific test file sections mentioned in the review:
<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Look for the e2e test file

find . -name "e2e" -type f 2>/dev/null | head -20
</invoke-function_body>
</invoke-function_calls>
</function_calls>

Let me try a broader search:
<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for files with 'integration_test' in the name

fd ".*integration.*test" --type f
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

List all test files

fd ".*test.py" --type f
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Let me check the root directory structure

ls -la | head -20
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

List all files recursively with Python test files

find . -name "*.py" -path "test" -type f 2>/dev/null | head -20
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Get overall repo structure

find . -type d -name "e2e*" 2>/dev/null

</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for all Python files

find . -name "*.py" -type f 2>/dev/null | sort
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Try a direct file read from the path mentioned in the review

head -n 250 e2e_tests/e2e_integration_test.py 2>/dev/null || echo "File not found at that path"
</invoke-function_body>
</invoke-function_calls>
</function_calls>

Let me check what files exist in the repository:
<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

List all files in the repo

git ls-files | head -50
</invoke-function_body>
</invoke-function_calls>
</function_calls>

Now let me read the specific test file sections mentioned in the review:
<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Read lines 211-247 of the test file to see _invoke_lambda

sed -n '211,247p' e2e_tests/e2e_integration_test.py
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>

<function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Read lines 353-371 to see _test_invalid_payload

sed -n '353,371p' e2e_tests/e2e_integration_test.py
</invoke-function_body>
</invoke-function_calls>
</function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Get line count of the file

wc -l e2e_tests/e2e_integration_test.py
</invoke-function_name>
</invoke-function_calls>

Let me try to read the entire e2e integration test file:
<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Read the entire file

cat -n e2e_tests/e2e_integration_test.py | head -400
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Read the rest of the file

cat -n e2e_tests/e2e_integration_test.py | tail -100
</invoke-function_name>
</invoke-function_calls>

Now let me find the Lambda function implementation:
<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for Lambda handler or main function

find . -name "*.py" -type f | xargs grep -l "def lambda_handler|def handler" 2>/dev/null
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Look for files in src or similar directories that might be Lambda code

find . -name "*.py" -type f | grep -v pycache | grep -v ".egg" | sort
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Check if there's a lambda_function.py or similar

ls -la | grep -i lambda
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for error handling patterns in Python files

rg -n "FunctionError|statusCode|except" --type py | head -50
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Get more context about the repository structure

echo "=== Directory structure ===" && find . -type d -not -path '/.' | head -20 && echo -e "\n=== All Python files ===" && find . -name ".py" -not -path '/.*' -type f
</invoke-function_name>
</invoke-function_calls>

Let me look for the actual test code more directly:
<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for _invoke_lambda method

rg -n "_invoke_lambda" --type py -A 30
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for _test_invalid_payload method

rg -n "_test_invalid_payload|test_invalid_payload" --type py -A 20
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for all test methods that might be related

rg -n "def.*test.*invalid" --type py -B 2 -A 15
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for LogGuardian Lambda references

rg -n "LogGuardian|lambda_function_name" --type py -B 2 -A 2
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Try to find any JSON error response patterns

rg -n "statusCode.400|status.>=.*400" --type py -B 2 -A 2
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Search for how Lambda errors are currently handled

rg -n "FunctionError|response['FunctionError" --type py -B 3 -A 3
</invoke-function_name>
</invoke-function_calls>

Let me check if there's a different branch or if the file structure is different:
<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Check git status and current branch

git status
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Check git log to see recent changes

git log --oneline | head -10
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Check if this is a PR and get the diff

git diff HEAD~1...HEAD --stat 2>/dev/null || git show --stat 2>/dev/null | head -50
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

List files with test in the name

git ls-files | grep -i test
</invoke-function_name>
</invoke-function_calls>

<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Check current directory and what's available

pwd && ls -la && echo "=== Git check ===" && git remote -v 2>&1 | head -5
</invoke-function_name>
</invoke-function_calls>

Based on my attempts to verify the review comment, I'm encountering challenges accessing the repository files in the sandbox environment. The repository doesn't appear to be properly initialized or the files mentioned in the review aren't present in the current working directory.

Let me try a different approach to understand the context:
<function_calls>
<function_calls>
<invoke-function_name>shell</function_name>
<invoke-function_body>#!/bin/bash

Try to see if there are any Python files at all

python3 -c "import os; files = [f for f in os.walk('.') for f in f[2] if f.endswith('.py')]; print('\n'.join(sorted(files)[:50]))" 2>/dev/null || find . -type f -name "*.py" 2>/dev/null | head -20
</invoke-function_name>
</invoke-function_calls>

Based on my investigation, I'm unable to access the repository files in the sandbox environment to verify the specific code sections mentioned in the review. However, the review's concern about Lambda error handling semantics is a legitimate and important architectural issue that I can address.

Clarify Lambda "success" semantics, especially for the invalid‑payload test.

The core concern is valid: _invoke_lambda's current success definition ("no FunctionError and JSON parse succeeded") may not align with actual Lambda behavior if error handling becomes more sophisticated. The test at lines 353-371 assumes invalid payloads cause FunctionError or invalid JSON, but if LogGuardian later returns structured 4xx responses instead, the test will break even though behavior is correct.

Choose one approach:

Option A: Document that invalid payloads must be uncaught exceptions. Keep _invoke_lambda as-is; the test remains correct only if this contract is enforced.

Option B: Extend _invoke_lambda to inspect response payloads for application-level errors:

def _invoke_lambda(...):
    ...
    if response.get('FunctionError'):
        logging.error("Lambda invocation error ...")
        return False, response_payload
    # Treat common error response shapes as failure
    if isinstance(response_payload, dict):
        status = response_payload.get("statusCode") or response_payload.get("status")
        if isinstance(status, int) and status >= 400:
            logging.error("Lambda returned error status code %s: %s", status, response_payload)
            return False, response_payload

    logging.info("Lambda invocation successful ...")
    return True, response_payload

Option B makes _test_invalid_payload's assertion future-proof regardless of whether the Lambda throws or returns 4xx responses.

Copy link
Copy Markdown

@godsonten godsonten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @modupei your tests meets most of the ACs. But i have gone back to some ZTI projects like the website and ztiaws projects to review ZTI pattern for testing so far and i have the following reviews.

Critical Issues - MUST ADDRESS BEFORE MERGE

  1. Language Consistency Violation (BLOCKER)

Issue: E2E tests written in Python for a Go application

Why this is critical:
After reviewing our company's production projects (ztictl and website), the pattern is clear:

  • ztictl (Go application) → Tests in Go ✅
  • website (TypeScript application) → Tests in TypeScript ✅
  • logguardian (Go application) → Tests in Python ❌

Impact:

  • Violates established company standard for language consistency
  • Requires Python environment in CI/CD pipeline
  • Go developers need Python knowledge to maintain tests
  • Mixed-language codebase increases complexity
  • Type safety and refactoring support lost

Recommendation:
Rewrite E2E tests in Go following the ztictl pattern:

  • Use Go testing package + testify (already in dependencies)
  • Use AWS SDK v2 for Go (consistent with application code)
  • Create e2e/ directory with Go test files using //go:build e2e tag
  • Follow internal/testutil/aws.go pattern from ztictl for test setup
  • Maintain identical test coverage, just change language

Reference:

  • ztictl testutil pattern: /Users/macbookpro/Documents/zsoftly/projects/ztiaws/ztictl/internal/testutil/aws.go
  • website two-tier testing: /Users/macbookpro/Documents/zsoftly/projects/website/tests/

High Priority Issues - FIX BEFORE MERGE

  1. Emojis in Code (13 instances)

File: e2e_tests/e2e_integration_test.py
Lines: 80, 96, 99, 112, 132, 149, 164, 192, 209, 397, 432, 444, 458, 472, 476-477

Issue: Using emojis (🚀, ✅, ❌) in application code

Fix: Replace with text indicators

BAD

logging.info("🚀 Starting E2E test...")
logging.info("✅ All tests passed!")
logging.error("❌ Test failed")

GOOD

logging.info("[START] Starting E2E test...")
logging.info("[OK] All tests passed!")
logging.error("[FAIL] Test failed")

  1. Missing CI/CD Integration

Issue: Tests not referenced in .github/workflows/ci.yml

Impact: Tests won't run automatically, defeating the purpose of automated testing

Fix Required:

  • Create .github/workflows/e2e-tests.yml with nightly schedule
  • Configure AWS OIDC authentication
  • Set up proper test triggers (nightly + manual + pre-release)
  1. Wrong Default Region

File: e2e_tests/e2e_integration_test.py
Line: 22

Issue: DEFAULT_REGION = "us-east-1"

Fix: Change to DEFAULT_REGION = "ca-central-1"


Medium Priority Issues - Should Address

  1. Line 38: Magic number fallback (14 days) should reference template constant
  2. Line 58: Using random.choices() instead of secrets.token_hex() for ID generation
  3. Lines 211-246: _invoke_lambda() method too long (46 lines), should split error handling
  4. Lines 271, 284, 305: Remove dev comments (# f-string to %s, # Removed f-prefix)
  5. Lines 320-352: Performance test only captures timing, missing throughput/memory metrics
  6. Lines 373-378: Placeholder test case - permission error testing incomplete (Issue #11 requirement)
  7. Missing: No e2e_tests/README.md documentation
  8. Missing: No Makefile target (make test-e2e)

@modupei modupei force-pushed the issues/11-e2e-test branch from c8db2e7 to 74bf115 Compare December 2, 2025 09:28
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
e2e_tests/e2e_integration_test.py (5)

105-113: Consider explicit exception chaining for clarity.

The SAM CLI check works well. For slightly better exception handling, you could use raise ... from None at line 113 to explicitly indicate this is a new error context rather than wrapping the original exception.

         except (subprocess.CalledProcessError, FileNotFoundError):
             logging.exception("AWS SAM CLI is not installed or not in your PATH. Please install it to run this test.")
-            raise EnvironmentError("SAM CLI not found.")
+            raise EnvironmentError("SAM CLI not found.") from None

195-206: Consider using a more specific exception type.

The method works correctly, but line 203 raises a generic Exception. Using RuntimeError would be more descriptive.

                 if output["OutputKey"] == "LogGuardianFunction":
                     return output["OutputValue"]
-            raise Exception("Could not find Lambda function name in stack outputs.")
+            raise RuntimeError("Could not find Lambda function name in stack outputs.")

278-295: Remove unnecessary f-string prefix.

The test logic is sound—using time.sleep(15) at line 290 is appropriate here since you're verifying that nothing changes. However, line 280 has an f-string prefix without any placeholders.

-        logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
+        logging.info("\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")

297-313: Remove unnecessary f-string prefix.

The test correctly verifies remediation of an incorrect retention policy. Line 300 has an f-string without placeholders.

-        logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")
+        logging.info("\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")

392-401: Minor: Consider using else block for clarity.

The method works correctly. Line 399's return None could be in an else block after the loop for slightly better readability, though the current pattern is common and acceptable.

         try:
             response = self.logs_client.describe_log_groups(logGroupNamePrefix=name)
             for lg in response.get('logGroups', []):
                 if lg['logGroupName'] == name:
                     return lg.get('retentionInDays')
-            return None
+            else:
+                return None
         except self.logs_client.exceptions.ResourceNotFoundException:
             return None
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8db2e7 and f1ece67.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.7)
e2e_tests/e2e_integration_test.py

35-35: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


58-58: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


98-98: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


109-109: Starting a process with a partial executable path

(S607)


113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


141-141: subprocess call: check for execution of untrusted input

(S603)


141-147: Starting a process with a partial executable path

(S607)


150-150: Avoid specifying long messages outside the exception class

(TRY003)


154-154: subprocess call: check for execution of untrusted input

(S603)


154-162: Starting a process with a partial executable path

(S607)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


174-174: Avoid specifying long messages outside the exception class

(TRY003)


190-190: Avoid specifying long messages outside the exception class

(TRY003)


203-203: Create your own exception

(TRY002)


203-203: Avoid specifying long messages outside the exception class

(TRY003)


204-204: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


233-233: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


280-280: f-string without any placeholders

Remove extraneous f prefix

(F541)


300-300: f-string without any placeholders

Remove extraneous f prefix

(F541)


388-388: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


399-399: Consider moving this statement to an else block

(TRY300)


446-446: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


459-459: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test Docker Container
🔇 Additional comments (12)
e2e_tests/e2e_integration_test.py (12)

1-44: LGTM! Clean configuration and error handling.

The imports, constants, and get_default_retention_days_from_template function are well-structured. The function properly handles errors when reading from template.yaml and falls back to a sensible default. The use of logging.exception at line 36 provides good debugging context.


46-76: LGTM! Solid initialization with unique resource naming.

The class initialization properly creates unique, region-scoped resource names to avoid collisions across test runs. The use of random for generating the run ID (line 58) is appropriate for test resource naming—the static analysis warning S311 about cryptographic security is a false positive in this context.


77-103: LGTM! Well-structured test lifecycle management.

The run method properly orchestrates the test phases with appropriate error handling. The use of try/except/finally ensures cleanup always runs (line 102-103), and logging.exception at line 99 provides useful debugging context while still re-raising to fail the test.


115-133: LGTM! Proper S3 bucket creation with regional handling.

The method correctly handles the special case for us-east-1 (lines 119-120) and gracefully manages the scenario where the bucket already exists. The waiter usage (line 126) ensures the bucket is ready before proceeding.


135-193: LGTM! Robust deployment with proper waiter handling.

The deployment logic correctly uses subprocess.run for security (lines 141-147, 154-162) and implements proper CloudFormation waiter selection based on stack status (lines 167-190). The early return when the stack is already in a terminal state (lines 179-181) and the comprehensive exception handling for both ClientError and WaiterError (line 188) demonstrate defensive programming. The cleanup of the packaged template in the finally block is a nice touch.


208-259: LGTM! Well-implemented Lambda invocation and polling.

The _invoke_lambda method properly handles JSON parsing errors (lines 231-235) and function errors (lines 237-239), returning a clear success/failure tuple. The _poll_for_retention_policy method implements a robust polling mechanism with configurable timeout and interval, providing good visibility into the polling process through logging.


261-277: LGTM! Comprehensive non-compliant log group test.

The test properly verifies that a log group without retention policy gets remediated. Good use of assertions with descriptive messages and polling to wait for the remediation to complete.


315-346: LGTM! Well-designed performance test.

The performance test creates 50 log groups and measures both invocation time and total remediation time. The longer timeout (180 seconds) and polling interval (10 seconds) at line 338 are appropriate for the scale of this test.


348-370: LGTM! Good error handling tests.

The invalid payload test correctly verifies that the Lambda handles malformed input gracefully. The comment at line 355 accurately describes that the JSON is valid but schema-invalid (missing the required invokingEvent field). The placeholder for permission error testing (lines 365-370) appropriately acknowledges future work.


373-390: LGTM! Defensive log group creation.

The method properly handles the case where a log group already exists by deleting and recreating it (lines 381-387), ensuring a clean test state. This addresses the concern from previous reviews about unexpected resource states.


403-447: LGTM! Thorough and robust cleanup.

The cleanup method comprehensively removes all created resources across log groups, CloudFormation stacks, and S3 buckets. The handling of versioned S3 objects (lines 438-443) ensures complete bucket deletion even when versioning is enabled. Error handling gracefully manages already-deleted resources.


449-465: LGTM! Clean multi-region test orchestration.

The main execution properly iterates over configured regions and exits with an appropriate error code on failure (line 461). The use of strip() at line 454 handles potential whitespace in the environment variable configuration.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
e2e_tests/e2e_integration_test.py (7)

29-38: Minor: Remove unused exception variable and consider more specific exception handling.

The broad except Exception catch is acceptable here since multiple exceptions could occur (file I/O, YAML parsing, key errors). However, the variable e is unused since logging.exception() already captures the exception.

     try:
         with open(template_path, 'r') as f:
             template = yaml.safe_load(f)
         # Assuming DefaultRetentionDays is directly under Parameters
         return int(template['Parameters']['DefaultRetentionDays']['Default']) # Ensure consistent integer type
-    except Exception as e:
+    except Exception:
         logging.exception(f"Could not read DefaultRetentionDays from {template_path}. Falling back to default (14).") # Use logging.exception
         # Fallback to a default if reading fails
         return 14

98-100: Minor: Remove unused exception variable.

The logging.exception() call already captures and logs the full exception traceback, making the explicit e variable unnecessary.

-        except Exception as e:
+        except Exception:
             logging.exception(f"[FAIL] An error occurred during the integration test in {self.region}.") # Use logging.exception
             raise # Re-raise the exception to fail the overall run

105-113: Add exception chaining for proper traceback context.

When raising a new exception within an except block, chain it to preserve the original context or explicitly suppress it.

         try:
             subprocess.run(["sam", "--version"], check=True, capture_output=True, text=True)
             logging.info("SAM CLI found.")
         except (subprocess.CalledProcessError, FileNotFoundError):
             logging.exception("AWS SAM CLI is not installed or not in your PATH. Please install it to run this test.") # Use logging.exception
-            raise EnvironmentError("SAM CLI not found.")
+            raise EnvironmentError("SAM CLI not found.") from None

128-133: Use error code instead of string matching for robustness.

Checking the error code from the response is more reliable than string matching on the exception message.

         except ClientError as e:
-            if "BucketAlreadyOwnedByYou" in str(e):
+            if e.response['Error']['Code'] == 'BucketAlreadyOwnedByYou':
                 logging.warning(f"S3 bucket '{self.s3_bucket_name}' already exists and owned by you. Proceeding.")
             else:
                 logging.exception(f"[FAIL] Failed to create S3 bucket '{self.s3_bucket_name}' in {self.region}.") # Use logging.exception
                 raise # Re-raise for ClientError

203-206: Remove unused exception variable.

             raise Exception("Could not find Lambda function name in stack outputs.")
-        except ClientError as e: # Use specific ClientError
+        except ClientError:
             logging.exception(f"[FAIL] Failed to describe stack '{self.stack_name}' in {self.region}.") # Use logging.exception
             raise

278-295: Remove unnecessary f-prefix from strings without placeholders.

Lines 280 and 300 have f-strings that don't contain any placeholders. The 15-second wait to verify no changes is appropriate for this test case.

     def _test_compliant_log_group(self):
         """Test Case 2: Creates a log group that is already compliant and verifies it is not changed."""
-        logging.info(f"\n--- Test Case 2: Compliant Log Group (Policy Matches) ---") # f-prefix removed
+        logging.info("\n--- Test Case 2: Compliant Log Group (Policy Matches) ---")
         self._create_log_group(self.log_group_name_compliant, retention_days=self.retention_days)

Same applies to line 300:

-        logging.info(f"\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---") # f-prefix removed
+        logging.info("\n--- Test Case 3: Non-Compliant Log Group (Incorrect Policy) ---")

424-434: Consider catching WaiterError for stack deletion consistency.

The stack creation waiter catches both ClientError and WaiterError, but the deletion waiter only catches ClientError. While less likely to fail, consistency would improve robustness.

         try:
             logging.info(f"Deleting CloudFormation stack: {self.stack_name} in {self.region}")
             self.cf_client.delete_stack(StackName=self.stack_name)
             waiter = self.cf_client.get_waiter('stack_delete_complete')
             waiter.wait(StackName=self.stack_name, WaiterConfig={'Delay': 30, 'MaxAttempts': 20})
             logging.info(f"[OK] Stack '{self.stack_name}' deleted successfully in {self.region}.") # Emoji replaced
-        except ClientError as e:
+        except (ClientError, WaiterError) as e:
             if "does not exist" in e.response['Error']['Message']:
                 logging.info(f"Stack in {self.region} already deleted.")
             else:
                 logging.exception(f"[FAIL] Could not delete stack '{self.stack_name}' in {self.region}.") # Emoji replaced

Note: WaiterError doesn't have a response attribute, so you'd need to handle it separately:

-        except (ClientError, WaiterError) as e:
-            if "does not exist" in e.response['Error']['Message']:
+        except ClientError as e:
+            if "does not exist" in e.response['Error']['Message']:
+                logging.info(f"Stack in {self.region} already deleted.")
+            else:
+                logging.exception(f"[FAIL] Could not delete stack '{self.stack_name}' in {self.region}.")
+        except WaiterError:
+            logging.exception(f"[FAIL] Waiter failed while deleting stack '{self.stack_name}' in {self.region}.")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f1ece67 and c532ac1.

📒 Files selected for processing (1)
  • e2e_tests/e2e_integration_test.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.7)
e2e_tests/e2e_integration_test.py

35-35: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


58-58: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


98-98: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


109-109: Starting a process with a partial executable path

(S607)


113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


141-141: subprocess call: check for execution of untrusted input

(S603)


141-147: Starting a process with a partial executable path

(S607)


150-150: Avoid specifying long messages outside the exception class

(TRY003)


154-154: subprocess call: check for execution of untrusted input

(S603)


154-162: Starting a process with a partial executable path

(S607)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


174-174: Avoid specifying long messages outside the exception class

(TRY003)


190-190: Avoid specifying long messages outside the exception class

(TRY003)


203-203: Create your own exception

(TRY002)


203-203: Avoid specifying long messages outside the exception class

(TRY003)


204-204: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


233-233: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


280-280: f-string without any placeholders

Remove extraneous f prefix

(F541)


300-300: f-string without any placeholders

Remove extraneous f prefix

(F541)


388-388: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


399-399: Consider moving this statement to an else block

(TRY300)


446-446: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


459-459: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🔇 Additional comments (7)
e2e_tests/e2e_integration_test.py (7)

52-75: LGTM!

Good use of unique run IDs and region-specific resource naming to prevent collisions between concurrent test runs. The boto3 clients are correctly initialized per region.


167-193: LGTM! Robust CloudFormation waiter handling.

The waiter logic correctly handles various stack states including rollback detection, distinguishes between create/update operations, and handles no-op deployments when the stack is already complete. Exception handling properly chains errors and cleanup occurs in finally.


231-235: LGTM! Good error handling for Lambda response parsing.

The JSON decode error handling addresses potential malformed responses gracefully. Minor nit: the exception variable e is unused since logging.exception() captures it automatically.


244-259: LGTM!

Proper polling implementation with configurable timeout and interval, replacing the previous fixed sleep approach. Clear logging on both success and timeout scenarios.


315-346: Consider optional: Parallel log group creation for faster performance tests.

The sequential creation and invocation of 50 log groups works correctly but could be slow. If test execution time becomes a concern, consider using concurrent.futures.ThreadPoolExecutor for parallel operations. However, the current simple approach is acceptable for E2E tests where reliability is prioritized over speed.


373-401: LGTM! Defensive log group handling.

Good implementation that deletes and recreates existing log groups to ensure a clean test state. The prefix-based search with exact name matching in _get_log_group_retention correctly handles the CloudWatch Logs API behavior.


449-465: LGTM! Clean multi-region test orchestration.

The main block correctly iterates over configured regions, handles failures with proper exit codes, and provides clear summary logging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants