Skip to content

anunjinb/Senior-Design

Repository files navigation

PROJECT DOCUMENTATION: APEX SYSTEMOS INTELLIGENCE Enterprise Bug Triage and Severity Prediction Engine

  1. PRODUCT OVERVIEW

Apex SystemOS is a high-performance machine learning platform engineered to automate the triage of software vulnerabilities. By leveraging a historical training foundation of over 222,000 Mozilla Firefox bug records, the system provides sub-second severity predictions and duplicate detection to optimize engineering workflows.

  1. THE CORE INTELLIGENCE ARCHITECTURE: THE RETRAINING LOOP

The platform has evolved from a static predictive model into a dynamic "Human-in-the-Loop" ecosystem. This ensures that the AI matures alongside the specific domain expertise of the engineering team.

A. Manual Analysis & Human-Correction Ingestion

Logic: When an engineer overrides a system-generated prediction in the Analysis module, the system captures the "Ground Truth."

System Impact: These corrections are saved to a dedicated feedback telemetry table. The Performance dashboard immediately recalibrates the global Accuracy and Correction Rate metrics.

B. High-Volume Batch Ingestion

Logic: The Submit module facilitates the bulk upload of JSON/CSV datasets for retroactive analysis.

System Impact: The engine performs a mass comparison between predicted and actual severities, instantly populating the Confusion Matrix and identifying statistical drift in the model.

C. Dynamic Performance Telemetry

Logic: The Performance dashboard has been re-engineered to move away from static fallbacks to live SQL aggregation.

System Impact: Training volume is calculated as the sum of baseline historical data plus live user feedback, ensuring stakeholders see a transparent, real-time representation of system knowledge.

  1. TECHNICAL SPECIFICATIONS

The current build reflects a transition from a visual prototype to a distributed data-driven application.

Backend Environment: FastAPI (Python 3.10+) utilizing asynchronous request handling.

Database Infrastructure: Supabase PostgreSQL with Row-Level Security (RLS) for multi-tenant data isolation.

Machine Learning: Random Forest Classifier (100 Estimators) with N-gram processing for technical linguistic analysis.

Storage Logic: Dedicated company-id partitioning to ensure strict data privacy and security compliance.

Search Engine: Vectorized semantic search for identifying duplicate bug reports across large-scale repositories.

  1. DATA PIPELINE FLOW

INGESTION: Data is received via the upload_and_train endpoint.

EVALUATION: The backend compares predicted_severity against actual_severity.

TELEMETRY: Metrics are aggregated in real-time via the get_ml_metrics API.

LOCALIZATION: System timestamps are automatically converted from UTC to local system time to ensure accurate audit trails.

  1. DEMONSTRATION & SYSTEM VALIDATION

For a comprehensive overview of the system in a production environment, please refer to the attached Demo video.mov. This video demonstrates:

Live bulk data ingestion and processing.

Real-time update of the Confusion Matrix.

Integration of user corrections into the performance telemetry.

  1. PROJECT MILESTONES (DEC 2025 - MAR 2026)

Q4 2025: Architecture design and database schema migration (PostgreSQL).

Q1 2026: Implementation of the Dynamic Metrics API and JWT security protocols.

Q1 2026 (Current): Deployment of the live retraining loop and real-time telemetry dashboard

About

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors