Fix logging functions and add comprehensive error handling#81

Open
jessegoodier wants to merge 6 commits into main from fix/logging-functions

@jessegoodier (Owner)

Summary

This PR fixes the log-fetching errors raised by the Kubernetes API client, adds a log continuity system, and makes debug logging configurable:

🔧 Kubernetes API Fixes

  • Fixed ApiTypeError: Got an unexpected keyword argument 'since_time' by passing the supported since_seconds parameter instead
  • Added timestamp-to-seconds conversion so that log continuity tracking can express the last seen timestamp as a since_seconds value
  • Improved error handling for containers in startup states (ContainerCreating, waiting to start)
  • Reduced log noise from containers that aren't ready yet
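The timestamp-to-seconds conversion mentioned above could look roughly like the following. This is a minimal sketch, not the code from this PR: the function names are hypothetical, and it assumes the last seen timestamp is an RFC 3339 UTC string ending in `Z`, possibly with nanosecond precision.

```python
import math
import re
from datetime import datetime, timezone
from typing import Optional

# Matches timestamps such as 2025-06-18T22:09:30.500000000Z (fraction optional).
_TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")


def parse_log_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp, truncating nanoseconds to microseconds."""
    match = _TS_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported timestamp format: {value!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    fraction = (match.group(2) or "")[:6].ljust(6, "0")
    return base.replace(microsecond=int(fraction), tzinfo=timezone.utc)


def since_seconds_from_timestamp(last_seen: str, now: Optional[datetime] = None) -> int:
    """Return a positive integer number of seconds covering everything after last_seen."""
    now = now or datetime.now(timezone.utc)
    elapsed = (now - parse_log_timestamp(last_seen)).total_seconds()
    # Round up so the window overlaps the last seen line instead of skipping past it,
    # and never return less than 1 because the value must be a positive integer.
    return max(1, math.ceil(elapsed))
```

The result would be passed as `since_seconds=` to `read_namespaced_pod_log`. Rounding up means the response can repeat lines that were already delivered, so this approach relies on deduplication downstream.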

📊 Comprehensive Log Continuity System

  • Session-based tracking: Prevents missing log lines between requests using UUID session management
  • Client-side buffering: Implements deduplication and overlap detection with Map-based log storage
  • Parallel container fetching: Fixed race conditions in multi-container log aggregation using ThreadPoolExecutor
  • Search functionality: Improved search to prevent window shifting during filtering
  • Monitoring endpoints: Added /api/log_continuity_stats and /api/log_health_check for diagnostics
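As an illustration of the parallel fetching and deduplication described above, here is a hedged sketch. The `fetch_one` callable, the tuple layout, and the function names are assumptions made for the example; the PR's actual implementation may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Set, Tuple

# (timestamp, container, message); timestamps are assumed to be same-format UTC
# strings so that lexicographic order matches chronological order.
LogLine = Tuple[str, str, str]


def fetch_all_containers(
    containers: List[str],
    fetch_one: Callable[[str], List[Tuple[str, str]]],
    max_workers: int = 8,
) -> List[LogLine]:
    """Fetch logs for every container concurrently and merge them by timestamp."""
    if not containers:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(containers))) as pool:
        # pool.map preserves input order, so results line up with `containers`.
        results = list(pool.map(fetch_one, containers))
    merged = [
        (timestamp, name, message)
        for name, lines in zip(containers, results)
        for timestamp, message in lines
    ]
    merged.sort(key=lambda line: (line[0], line[1]))
    return merged


def drop_seen(lines: List[LogLine], seen: Set[LogLine]) -> List[LogLine]:
    """Return only lines not delivered before, recording them in `seen`."""
    fresh = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            fresh.append(line)
    return fresh
```

Starting all fetches at roughly the same moment keeps the per-container windows aligned, which is the timing-skew benefit the PR describes. The `seen` set here stands in for the unique line IDs used for deduplication.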

🐛 Debug Logging Configuration

  • Helm chart integration: Added logging section to values.yaml with granular debug controls
  • Component-specific debugging: Separate debug flags for app, Kubernetes client, and log archiver
  • Environment variable configuration: Configurable log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  • Improved log formatting: Added timestamps, module names, and structured logging
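On the application side, the environment-variable configuration described above might be consumed along these lines. The variable names (`LOG_LEVEL`, `DEBUG_APP`, `DEBUG_KUBERNETES`, `DEBUG_ARCHIVER`) and the logger names are assumptions for illustration; the PR does not list the names it uses.

```python
import logging
import os
from typing import Mapping, Optional

VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")

# Hypothetical mapping from per-component debug flags to logger names.
COMPONENT_LOGGERS = {
    "DEBUG_APP": "app",
    "DEBUG_KUBERNETES": "kubernetes",
    "DEBUG_ARCHIVER": "log_archiver",
}


def configure_logging(env: Optional[Mapping[str, str]] = None) -> int:
    """Set the global log level and per-component debug overrides from env vars."""
    env = os.environ if env is None else env
    level_name = env.get("LOG_LEVEL", "INFO").upper()
    if level_name not in VALID_LEVELS:
        level_name = "INFO"  # fall back rather than fail on a bad value
    level = getattr(logging, level_name)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    for variable, logger_name in COMPONENT_LOGGERS.items():
        enabled = env.get(variable, "").strip().lower() in ("1", "true", "yes")
        # NOTSET makes the component inherit the global level again.
        logging.getLogger(logger_name).setLevel(logging.DEBUG if enabled else logging.NOTSET)
    return level
```

With this shape, a component flag only ever lowers the threshold for its own logger, so enabling Kubernetes client debugging does not flood the output with debug lines from the rest of the application.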

🧪 Test Infrastructure

  • Fixed all failing Playwright tests by resolving namespace mismatches and hardcoded expectations
  • Updated test deployment configurations for proper pod selection
  • Ensured tests pass consistently across different cluster configurations

Usage Examples

Enable debug logging:

helm upgrade logpilot ./charts/logpilot \
  --set logging.level=DEBUG \
  --set logging.debug.kubernetes=true

Monitor log continuity:

curl http://localhost:5001/api/log_continuity_stats
curl http://localhost:5001/api/log_health_check

Test Plan

  • All Playwright tests pass (69 passed, 8 skipped)
  • No Kubernetes API errors in deployment logs
  • Debug logging toggles work correctly
  • Log continuity tracking prevents missing lines
  • Multi-container pods work without race conditions
  • Search functionality preserves log continuity
  • Error handling gracefully manages container startup states
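The startup-state handling in the last item can be sketched as a small classifier. This is an assumption-laden example: the enum, the function name, and the marker strings are illustrative, chosen to match the kind of message the API server returns for a container that has not started yet.

```python
from enum import Enum


class LogFetchOutcome(Enum):
    NOT_READY = "not_ready"   # temporary: the container is still starting, retry later
    PERMANENT = "permanent"   # anything else: surface as a real error


# Substrings assumed to indicate a container that has not started yet.
TRANSIENT_MARKERS = ("ContainerCreating", "PodInitializing", "is waiting to start")


def classify_log_error(status: int, message: str) -> LogFetchOutcome:
    """Decide whether a failed log request is a startup state or a real error."""
    if status == 400 and any(marker in message for marker in TRANSIENT_MARKERS):
        return LogFetchOutcome.NOT_READY
    return LogFetchOutcome.PERMANENT
```

A caller could log `NOT_READY` results at debug level and return a "container starting" hint to the client, while logging `PERMANENT` results as errors.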

Breaking Changes

None - all changes are backward compatible.

Technical Details

  • Session Management: Thread-safe session cache with automatic cleanup
  • API Compatibility: Uses the supported Kubernetes API parameter (since_seconds instead of since_time)
  • Error Handling: Distinguishes between permanent errors and temporary container states
  • Performance: Parallel log fetching reduces timing skew between containers
  • Memory Management: Deduplication keeps duplicate log entries from accumulating in the log buffer
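A thread-safe session cache with automatic cleanup, as named in the first item, might be structured like this. It is a sketch under stated assumptions: the class name, the TTL-based expiry, and the stored fields are hypothetical, since the PR does not describe the cleanup policy.

```python
import threading
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class SessionCache:
    """Maps session IDs to the last log timestamp delivered for that session."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        # session_id -> (last_access_time, last_log_timestamp)
        self._sessions: Dict[str, Tuple[float, Optional[str]]] = {}

    def create(self) -> str:
        session_id = str(uuid.uuid4())
        with self._lock:
            self._purge_expired()
            self._sessions[session_id] = (self._clock(), None)
        return session_id

    def update(self, session_id: str, last_log_timestamp: str) -> None:
        with self._lock:
            self._sessions[session_id] = (self._clock(), last_log_timestamp)

    def last_timestamp(self, session_id: str) -> Optional[str]:
        with self._lock:
            self._purge_expired()
            entry = self._sessions.get(session_id)
            if entry is None:
                return None
            # Reading counts as activity, so refresh the access time.
            self._sessions[session_id] = (self._clock(), entry[1])
            return entry[1]

    def __len__(self) -> int:
        with self._lock:
            return len(self._sessions)

    def _purge_expired(self) -> None:
        # Caller must hold self._lock.
        cutoff = self._clock() - self._ttl
        expired = [sid for sid, (seen, _) in self._sessions.items() if seen < cutoff]
        for sid in expired:
            del self._sessions[sid]
```

Purging on access avoids a background thread, at the cost of idle sessions lingering until the next request arrives. The injectable `clock` is there only to make expiry testable.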

🤖 Generated with Claude Code

jessegoodier and others added 6 commits June 18, 2025 22:09
Major improvements:
- Add timestamp-based log continuity with session tracking
- Implement client-side log buffering with deduplication
- Fix race conditions in multi-container log aggregation with parallel fetching
- Improve search functionality to prevent window shifting
- Add comprehensive monitoring endpoints and enhanced logging

Backend changes:
- Thread-safe session management with proper locking
- Parallel container log fetching using ThreadPoolExecutor
- Enhanced error handling with retry suggestions
- New endpoints: /api/log_continuity_stats, /api/log_health_check
- Unique line IDs for robust deduplication

Frontend changes:
- Client-side log buffer with Map-based storage
- Session continuity support with automatic session ID management
- Reset log session functionality for debugging
- Enhanced error handling and user feedback

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Fixed string quote consistency (single to double quotes)
- Improved function parameter alignment and wrapping
- Enhanced dictionary and list formatting for readability
- Standardized import statement formatting
- Improved code structure and spacing consistency

This commit applies automatic formatting changes made by ruff
during the test execution process to ensure code consistency
across the project.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Replace invalid 'since_time' parameter with 'since_seconds'
- Add timestamp-to-seconds conversion for log continuity
- Improve error handling for containers not ready yet
- Reduce log noise for containers in 'ContainerCreating' state
- Add better error messages for container startup states

These changes resolve the ApiTypeError exceptions and reduce
noise from containers that are still starting up.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Add logging configuration section to values.yaml
- Support for setting global log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Granular debug logging controls for specific components:
  * app: Main application debug logging
  * kubernetes: Kubernetes client API debug logging
  * archiver: Log archival system debug logging
- Environment variables passed to deployment for logging config
- Improved log formatting with timestamps and module names
- Updated log_archiver.py to use proper logger instance

Example usage:
  helm upgrade logpilot ./charts/logpilot \
    --set logging.level=DEBUG \
    --set logging.debug.kubernetes=true

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Update chart source files to match main src/ directory
- Includes Kubernetes API fixes and debug logging configuration
- Ensures Helm deployments use the latest code

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>