I'm Sushil Dalavi, an AI Engineer at the USC Annenberg Norman Lear Center and an MS in Computer Science candidate at USC (2024 – 2026).
I architect production AI systems — AWS data platforms, hybrid retrieval pipelines, distributed LLM workflows, and multi-modal ML — with an emphasis on measurable outcomes, reliability, and reproducibility.
| 💼 | Open to SDE / SWE / AI·ML Engineer / Applied AI roles |
| 🏗️ | AWS data platforms, distributed workflows, LLM inference gateways |
| 🧠 | Hybrid retrieval, reranking, MLOps, multi-modal alignment |
| 📚 | Building JobSense, ScribeAI, and ScholarRAG |
| 🌍 | Motivated by real-world product impact |
| ⚽ | Proud Real Madrid supporter |
| 🍥 | Huge anime geek |
| University of Southern California | MS in Computer Science | 📍 Los Angeles, CA | 📅 Aug 2024 – May 2026 |
| University of Mumbai | BE in Computer Engineering | 📍 Mumbai, India | 📅 Jun 2019 – May 2023 |
| USC Annenberg Norman Lear Center | AI Engineer | 📍 Los Angeles, CA | 📅 Jun 2025 – Present |
| Reliance Jio Platforms | Software Engineer | 📍 Navi Mumbai, India | 📅 Dec 2023 – Jul 2024 |
📌 Highlights from USC Annenberg Norman Lear Center
- Architected an AWS data platform (S3, Glue, SageMaker, Bedrock) ingesting, deduplicating, and normalizing 1M+ multi-region records for downstream ML training and retrieval workloads.
- Shipped a multi-modal alignment system fusing audio, speaker diarization, and caption streams — reaching 99.3% F1 and 99.9% coverage on ground-truth evaluation.
- Developed large-scale batch pipelines processing long-form video and audio through Whisper ASR, pyannote diarization, and model-based refinement stages.
- Automated dataset QA, Unicode normalization, and deduplication in Python, filtering 10,819 records down to 9,735 analysis-ready records with full reproducibility.
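For illustration only, and not the production code, a normalize-then-deduplicate pass of the kind described in the last bullet might look like the sketch below. The record fields `id` and `text`, the choice of NFKC, and the case-insensitive matching are all assumptions made for the example.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """NFKC-normalize, case-fold, and collapse runs of whitespace."""
    normalized = unicodedata.normalize("NFKC", text)
    return " ".join(normalized.casefold().split())


def deduplicate(records: list[dict], key: str = "text") -> list[dict]:
    """Keep the first record for each normalized key value, preserving input order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for record in records:
        fingerprint = normalize_text(record[key])
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical sample records: 1 and 2 differ only in Unicode form, case, and spacing;
# 3 and 4 differ only by the "fi" ligature.
records = [
    {"id": 1, "text": "Café  Society"},
    {"id": 2, "text": "Cafe\u0301 society"},
    {"id": 3, "text": "\ufb01nal cut"},
    {"id": 4, "text": "final cut"},
]
unique_records = deduplicate(records)
```

Because the pass is deterministic and order-preserving, rerunning it on the same input yields the same surviving records, which is one simple way to keep a cleaning step reproducible.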
📌 Highlights from Reliance Jio Platforms
- Trained and deployed ResNet-50 and DenseNet-121 deep vision networks for medical image anomaly detection — improving recall by 35% via transfer learning, augmentation, and loss tuning.
- Optimized quantized transformer inference (BERT, GPT-2) on GPU with batched serving, cutting p95 latency by 30% while preserving accuracy.
- Engineered demand-forecasting microservices (TFT, CatBoost, LSTM) over Hive SQL batch pipelines, reducing forecast MAPE by 25% for business-critical workloads.
- Rolled out shadow-testing and canary-release workflows for 3 production ML upgrades, catching 2 latency regressions before fleet-wide deployment.
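As a rough sketch of how a canary gate can catch a latency regression like those mentioned in the last bullet, the example below compares the p95 latency of a candidate model against the current baseline. The function names, the nearest-rank percentile method, and the 10% tolerance are assumptions for illustration, not details of the actual workflow.

```python
def p95(latencies_ms: list[float]) -> float:
    """Return the 95th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding issues.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def is_latency_regression(
    baseline_ms: list[float],
    candidate_ms: list[float],
    tolerance: float = 0.10,
) -> bool:
    """Flag the candidate if its p95 exceeds the baseline p95 by more than `tolerance`."""
    return p95(candidate_ms) > p95(baseline_ms) * (1 + tolerance)


# Hypothetical samples: the candidate is uniformly 30 ms slower than the baseline.
baseline = [100.0 + i for i in range(100)]
candidate = [130.0 + i for i in range(100)]
blocked = is_latency_regression(baseline, candidate)
```

In a shadow test the candidate would receive mirrored traffic without serving responses, so a check like this can run before any user is exposed to the new model.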
🧭 JobSense
Durable distributed workflow platform: a fault-tolerant orchestration system on Temporal with 12 tool integrations, human-in-the-loop checkpoints, and a provider-agnostic inference gateway.
✍️ ScribeAI
Inference service with evaluation pipeline: an async FastAPI service with SSE streaming, multi-backend routing (GPT-4o, Claude, fallback), and an MLflow-tracked evaluation harness.
ScholarRAG
Retrieval and data engineering system: a hybrid retrieval pipeline for scholarly discovery with citation-aware grounding.
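The description above does not say how the hybrid retrieval pipeline combines its lexical and dense results. Reciprocal rank fusion (RRF) is one common choice, shown here purely as an assumed example; the document IDs and the constant `k = 60` are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked lists: each document scores the sum of 1 / (k + rank) over all lists."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a keyword retriever and an embedding retriever.
lexical_hits = ["paper_a", "paper_b", "paper_c"]
dense_hits = ["paper_b", "paper_d", "paper_a"]
fused = reciprocal_rank_fusion([lexical_hits, dense_hits])
```

RRF uses only rank positions, so it needs no score calibration between retrievers. A reranker could then be applied to the top of the fused list.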
🏥 MedSOAP
Clinical documentation automation: generates structured SOAP notes from doctor-patient conversations.
| 🎬 | Love web series and binge-watching | 🏊 | Swimming keeps me grounded |
| 🏓 | Enjoy table tennis | ⚽ | Lifelong football fan |
| 🍥 | Huge anime geek | 🎧 | Music always around |
A proud Real Madrid supporter — I love the mentality, the standards, the legacy, and the winning culture.
I'm especially interested in opportunities where strong software engineering meets AI/ML, backend systems, and data-driven product building.






