Trying to eke out the most from every CPU cycle, treating the hardware not as an abstraction but as a key participant in execution.
To do this, I focus on:
- Memory Hierarchy: Designing for L1/L2 cache residency. I use Data-Oriented Design (DOD) and strict 64-byte cache-line alignment to minimize cache misses and avoid false sharing.
- CPU Pipeline: Feeding the hardware prefetcher with linear access patterns. I minimize pipeline stalls through branchless programming, bit manipulation, and std::intrinsics, which takes pressure off the branch predictor and avoids misprediction penalties.
- Execution Concurrency: Maximizing ILP (Instruction-Level Parallelism) and out-of-order execution by breaking data dependencies. I reach for SIMD and inline assembly when the compiler hits its limit.
- Zero-Cost Resource Management: Reducing pointer indirection by prioritizing stack allocation and pre-allocated arenas over per-object heap allocation.
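
A minimal sketch of three of the ideas above: cache-line padding, a branchless select, and split accumulators for ILP. The names `PaddedCounter`, `branchless_max`, and `sum_ilp` are hypothetical and made up for illustration, and the 64-byte line size is an assumption that holds on most x86-64 parts but not on every target.

```rust
use std::sync::atomic::AtomicU64;

/// Hypothetical per-thread counter padded to one (assumed) 64-byte cache line,
/// so adjacent counters in an array never share a line (no false sharing).
#[repr(align(64))]
pub struct PaddedCounter(pub AtomicU64);

/// Branchless max: pick a value with a bit mask instead of a conditional jump.
pub fn branchless_max(a: i32, b: i32) -> i32 {
    // `(a < b) as i32` is 0 or 1; negating gives 0 or all ones (-1).
    let mask = -((a < b) as i32);
    (a & !mask) | (b & mask)
}

/// Wrapping sum using four independent accumulators, so consecutive additions
/// do not form one long dependency chain and can overlap in the pipeline.
pub fn sum_ilp(data: &[u64]) -> u64 {
    let mut acc = [0u64; 4];
    let mut chunks = data.chunks_exact(4);
    for c in chunks.by_ref() {
        acc[0] = acc[0].wrapping_add(c[0]);
        acc[1] = acc[1].wrapping_add(c[1]);
        acc[2] = acc[2].wrapping_add(c[2]);
        acc[3] = acc[3].wrapping_add(c[3]);
    }
    // Fold in the 0..=3 elements that did not fill a whole chunk.
    let tail = chunks
        .remainder()
        .iter()
        .fold(0u64, |s, &x| s.wrapping_add(x));
    acc[0]
        .wrapping_add(acc[1])
        .wrapping_add(acc[2])
        .wrapping_add(acc[3])
        .wrapping_add(tail)
}
```

An optimizing compiler will often turn a plain `if` into a conditional move or vectorize a simple sum by itself, so versions like these are only worth keeping when a benchmark on the target hardware shows a gain.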
"I only care if it is possible to improve, no matter the difficulty. Code is an Art! And at my very best, my goal is to write code that honors the most complex human invention ever, the microprocessor."
🦀 Rust · ⚡ Performance Engineering · The stack, ofc! · Branchless logic! · ⚙️ RISC-V · Embedded systems · 🧬 Evolving Code · Parallelism, even though it hurts
I am currently transitioning my development workflow to a private, air-gapped infrastructure to focus on high-assurance systems and advanced microarchitectural research. Moving forward, this profile will host deterministic performance tools and academic projects related to my studies.
Focus areas for 2026: FPGA-acceleration, fetchable L1 cache-resident logic, and discrete mathematical models.
✉️ Email: hadrian.lazic@gmail.com
💼 LinkedIn: Hadrian Lazic



