Deep Learning Engineer · Research-driven · Builder of AI-powered products
Turning state-of-the-art research into real-world impact
I build production-grade LLM systems (fine-tuning, evaluation, tool-calling agents, RAG, monitoring), edge autonomy stacks (VLA loops, deterministic guardrails, PX4/MAVSDK), multimodal generative models (diffusion, audio-conditioned motion), and hardware-aware ML (edge benchmarking, constrained deployment).
| Area | Evidence | Metrics |
|---|---|---|
| 🛩️ Edge Autonomy | Edge-VLA-Micro VLA stack | PX4 SITL · Qwen2-VL + MAVSDK · HSV/Pydantic guardrails · 65.6% TTFT reduction |
| 🧠 Medical AI Research | SSL for computational pathology | 30/30 with honors · preliminary phase completed · paper writing and pathology collaboration in progress |
| 🏭 Industrial LLM Systems | Simplex Rapid production workflows | ~53% touchless rate · ~76% cost reduction · 20 languages |
| 🤖 Generative Robotics | Audio-conditioned humanoid motion diffusion | DDPM Transformer · SMPL/AIST++ · TSI 12.60 → 0.08 · biomechanical evaluation |
| 🏁 Ranked Hackathon | HackUPC / Skyscanner | 4th / 150+ teams |
| 🏁 Ranked Hackathon | GenAI.Works | 7th / 4,500 participants |
Production AI for precision mechanics: translation, document automation, structured extraction, and internal workflow agents.
| System | Scope | Outcome |
|---|---|---|
| Technical Translation | GPT-4o domain adaptation, controlled terminology, QA sampling | 20 languages · ~53% touchless · ~71% effort reduction |
| Cost Optimization | Dataset iteration, model selection, latency/cost trade-offs | ~76% estimated cost reduction |
| Agentic Business Assistant | OpenAI Responses API, structured JSON, tool calling, tracing | Auditable automation with human escalation |
| Technical Drawing Extraction | LLM + Vision pipeline for spring drawings/PDFs | Per-field validation, manufacturing-parameter population |
| Cam1/Cam2 Control Prototype | Embedded vision + RL policy evaluation | Run-to-run correction for cutting-length variance |
Self-supervised learning for computational pathology. The preliminary academic phase is complete; paper writing and follow-up work with pathologists are in progress. Code, metrics, and implementation details remain private until publication.
| Dimension | Status |
|---|---|
| Research area | State-of-the-art self-supervised learning model · medical image analysis |
| Academic outcome | 30/30 with honors |
| Publication status | Preliminary phase completed · manuscript and pathology collaboration in progress |
| Source code / dataset / artifacts | Private — withheld until publication path and IP review are mature |
Audio-conditioned whole-body humanoid trajectory synthesis with DDPM diffusion transformers over 24-joint SMPL pose sequences. The project is built as a research-grade pipeline around AIST++ motion/audio data, Temporal Cross-Attention, classifier-free guidance, EMA inference, and automated biomechanical diagnostics.
| Dimension | Implementation |
|---|---|
| Generative Model | DDPM Transformer over 72D SMPL axis-angle motion tokens |
| Multimodal Conditioning | Audio feature alignment · beat/chroma/tempo features · Temporal Cross-Attention |
| Robotics Reliability | TSI, JLVR, BAS, self-collision risk, failure-case mining |
| Optimized Result | TSI 12.60 → 0.08 · JLVR 6.0% · BAS 0.19 · self-collision 0.0004 |
| Repository | Humanoid-Motion-Diffusion |
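
The classifier-free guidance and EMA inference steps mentioned above can be illustrated with a minimal sketch. It is not code from the repository: the function names, the plain-list representation of noise predictions and weights, and the default decay value are all assumptions chosen to keep the example dependency-free.

```python
from typing import List


def cfg_combine(eps_uncond: List[float], eps_cond: List[float],
                guidance_scale: float) -> List[float]:
    """Classifier-free guidance on two noise predictions (hypothetical helper).

    Starts from the unconditional prediction and moves toward the
    audio-conditioned one; guidance_scale = 1.0 returns eps_cond unchanged.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def ema_update(ema_weights: List[float], weights: List[float],
               decay: float = 0.999) -> List[float]:
    """One exponential-moving-average step over model weights (assumed decay)."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Toy usage: a guidance scale above 1 extrapolates past the conditional prediction.
guided = cfg_combine([0.0, 1.0], [1.0, 1.0], guidance_scale=2.0)
smoothed = ema_update([1.0], [0.0], decay=0.5)
```

In a real pipeline the same two formulas would be applied to tensors at every denoising step, with the EMA copy of the weights used at inference time.
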
Language-conditioned localization of functional regions on 3D objects for robotics-oriented perception.
| Dimension | Implementation |
|---|---|
| Vision-Language | CLIP-style embeddings · prompt-conditioned region scoring |
| 3D Pipeline | Differentiable rendering inspired by 3D Highlighter (CVPR 2023) |
| Robotics Relevance | Zero-shot affordance localization for interaction-aware reasoning |
| Repository | Affordance_Highlighting_Project_2024 |
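
Prompt-conditioned region scoring can be sketched as a cosine-similarity ranking between a text embedding and per-region embeddings. This is a simplified illustration, not the project's differentiable-rendering pipeline: the region names, the two-dimensional toy vectors, and both helper functions are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_regions(prompt_embedding: List[float],
                  region_embeddings: Dict[str, List[float]]) -> Dict[str, float]:
    """Score every candidate region against the prompt embedding."""
    return {name: cosine(prompt_embedding, emb)
            for name, emb in region_embeddings.items()}


# Toy embeddings standing in for CLIP-style text and region features.
scores = score_regions([1.0, 0.0], {"handle": [1.0, 0.0], "blade": [0.0, 1.0]})
best = max(scores, key=scores.get)
```

With real embeddings the scores would typically be normalized or thresholded before being mapped back onto the 3D surface.
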
| Project | Category | Signal |
|---|---|---|
| 🛩️ Edge-VLA-Micro | Edge autonomy / VLA | PX4 SITL · Qwen2-VL + MLX-VLM · MAVSDK/PX4 · HSV/Pydantic safety guardrails · blackbox telemetry |
| 🔤 gpt-tokenizer | Low-level LLM tooling | Byte-level BPE · Python/C parity · deterministic streaming |
| 🤖 Humanoid-Motion-Diffusion | Generative robotics | Audio-conditioned DDPM Transformer · SMPL/AIST++ · biomechanical validation |
| ⚙️ embedded-vision-tradeoffs-m7 | Edge AI benchmarking | Cortex-M7 · INT8 robustness · RAM/Flash/latency profiling |
| 🛰️ TinyHack2025 / MuseINO | Edge AI prototype | 24h Arduino Nicla Vision MVP · privacy-preserving attention analytics |
| 🧑‍💻 ZurichHackathon2025 / MAAS | Multi-agent systems | 36h dialog-to-action · Apertus-8B · judge model · disagreement gate |
| 🏁 HackUPC / Skyscanner | Product prototyping | 36h MVP · Skyscanner challenge · 4th / 150+ international teams |
| 📰 PostGenius | RAG application | FastAPI · OpenAI · Vectara · Runway · 7th / 4,500 participants |
| 🧰 mlops_finetuning_framework | LLMOps template | Data prep · fine-tuning · evaluation · deployment hygiene |
| 🤖 BisiAgent007 | Agentic tooling | RAG · semantic search · tool-use · automated code-edit loop |
| 🎮 quoridor | Low-level | C + ARM Assembly · systems signal |
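
For readers unfamiliar with the byte-level BPE listed for gpt-tokenizer, one training merge step can be sketched as follows. This is a generic illustration of the technique, not the repository's Python or C implementation; the function names and the choice of 256 as the first new token id are assumptions.

```python
from collections import Counter
from typing import List, Optional, Tuple


def most_frequent_pair(ids: List[int]) -> Optional[Tuple[int, int]]:
    """Return the most common adjacent pair of token ids, or None if too short."""
    if len(ids) < 2:
        return None
    counts = Counter(zip(ids, ids[1:]))
    return counts.most_common(1)[0][0]


def merge(ids: List[int], pair: Tuple[int, int], new_id: int) -> List[int]:
    """Replace each non-overlapping occurrence of `pair` (left to right) with `new_id`."""
    out: List[int] = []
    i = 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


# Byte-level: start from raw UTF-8 bytes, so ids 0..255 are reserved for bytes.
ids = list("aaab".encode("utf-8"))
pair = most_frequent_pair(ids)
merged = merge(ids, pair, new_id=256)
```

Repeating this step and recording each merged pair yields the merge table that an encoder later replays in the same order.
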


