README.md
Lines changed: 53 additions & 10 deletions
Original file line number
Diff line number
Diff line change
@@ -1,9 +1,19 @@
1
1
# C++ Refresh
2
2
3
-
Brush up on C++ personally to be a real-time engineer in AI era. For fundamental jargon you must know, please visit [jargon.md](./docs/jargon.md).
3
+
I'm currently brushing up on my C++ skills to prepare for a career as a real-time engineer in the age of AI. Check out [jargon.md](./docs/jargon.md) for the fundamental terms you'll need to know.
@@ -26,13 +36,13 @@ Brush up on C++ personally to be a real-time engineer in AI era. For fundamental
26
36
│ │ (Shared, MBs) │ │ Bigger and slower than L2
27
37
│ └────────────────────────┘ │
28
38
└──────────────┬───────────────┘
29
-
│
30
-
Local Memory Controller
31
-
│
39
+
│
40
+
Local Memory Controller
41
+
│
32
42
┌──────────────────┴──────────────────┐
33
43
│ │
34
-
RAM (NUMA Node 0) RAM (NUMA Node 1)
35
-
~80ns latency ~150ns latency
44
+
RAM (NUMA Node 0) RAM (NUMA Node 1)
45
+
~80ns latency ~150ns latency
36
46
```
37
47
38
48
How latency can grow...
@@ -47,15 +57,20 @@ Instruction →
47
57
miss → RAM (NUMA remote)
48
58
```
49
59
60
+
</details>
61
+
50
62
## Topics
51
63
52
-
[day01](./src/cpp/day01/): stack vs heap
53
-
[day02](./src/cpp/day02/): reference vs copy
54
-
[day03](./src/cpp/day03/): elide vs move vs copy
55
-
[day04](./src/cpp/day04/): STL Containers & API Design
64
+
- [day01](./src/cpp/day01/): stack vs heap
65
+
- [day02](./src/cpp/day02/): reference vs copy
66
+
- [day03](./src/cpp/day03/): elide vs move vs copy
67
+
- [day04](./src/cpp/day04/): STL Containers & API Design
56
68
57
69
## Allocators & Cache Behavior (Day 05)
58
70
71
+
<details>
72
+
<summary> Click to expand/collapse </summary>
73
+
59
74
### What an allocator actually is
60
75
61
76
An `allocator` answers two questions:
@@ -113,3 +128,31 @@ struct Good {
113
128
```
114
129
115
130
Group hot data together.
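
One way to check that hot fields really share a cache line is to let the compiler verify the layout. The following is a minimal sketch, not code from this repo: the `TrackHot` and `TrackCold` types, their fields, and the 64-byte line size are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache-line size: 64 bytes is common on x86-64, but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// Hypothetical cold data: rarely touched (e.g. logging or debugging).
struct TrackCold {
    char label[48];
    std::uint64_t created_at_ns;
};

// Hypothetical hot data: read on every iteration of the loop.
// alignas keeps each object on its own cache-line boundary.
struct alignas(kCacheLine) TrackHot {
    float x, y, w, h;       // bounding box
    float confidence;
    std::uint32_t id;
    TrackCold* cold;        // rarely used fields live elsewhere
};

// Compile-time checks: the hot part occupies exactly one assumed cache line.
static_assert(sizeof(TrackHot) == kCacheLine, "hot data should fit one cache line");
static_assert(alignof(TrackHot) == kCacheLine, "hot data should be line-aligned");
```

If a later change adds a field that pushes the hot struct past one line, the build fails instead of the regression showing up only in a profile.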
131
+
132
+
</details>
133
+
134
+
## Practical Application
135
+
136
+
Based on my experience, fine-tuning pre-trained AI models for specific applications is becoming increasingly straightforward, thanks to the optimization of inference frameworks, particularly on GPUs.
137
+
138
+
As model inference becomes faster and more efficient, the true bottleneck often shifts to data flow and real-time decision-making within the Python-based container. Python's inherent inefficiencies in the post-processing layer, especially on hot paths, can significantly hinder performance.
139
+
140
+
In AI-heavy applications—such as video streaming, autonomous driving, smart cities, and trading—optimizing everything beyond inference is critical. C++ plays a key role in eliminating these inefficiencies and squeezing out those final milliseconds. Therefore, I will simulate the Python hot path for processing object detection metadata and re-implement it in C++ to achieve the performance gains needed for real-time applications.
141
+
142
+
The directory [./src/python/yolo/inference](./src/python/yolo/inference/) will simulate Ultralytics YOLO's inference and generate dummy metadata, which is used in the analytics layer. (In real-world applications, this metadata loop is handled by NVIDIA DeepStream, which is highly optimized.)
143
+
144
+
The Python implementation will serve as a reference for the analytics pipeline, which I will later re-implement in C++ to optimize the performance of time-critical operations in real-time systems.
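
To give a rough idea of what a C++ hot path over detection metadata might look like, here is a minimal sketch. It is not the repo's implementation: the `Detection` layout and the `filter_detections` helper are hypothetical, and the real metadata schema may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical detection record; field names are assumptions,
// not the schema produced by the inference container.
struct Detection {
    std::int32_t class_id;
    float confidence;
    float x, y, w, h;  // bounding box
};

// Copies detections with confidence >= min_confidence into `out`
// and returns how many were kept.
// `out` is cleared rather than recreated, so a caller that reuses one
// buffer across frames can usually avoid per-frame heap allocations
// once the buffer has grown to its working size.
std::size_t filter_detections(const std::vector<Detection>& in,
                              float min_confidence,
                              std::vector<Detection>& out) {
    assert(min_confidence >= 0.0f && min_confidence <= 1.0f);
    out.clear();
    for (const Detection& d : in) {
        if (d.confidence >= min_confidence) {
            out.push_back(d);
        }
    }
    return out.size();
}
```

The contiguous `std::vector<Detection>` keeps records adjacent in memory, which ties back to the cache behavior notes above. Whether this beats the Python reference in practice is something to measure, not assume.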
145
+
146
+
```shell
147
+
# The inference container only simulates metadata generation quickly.
148
+
# In real-world applications, I used the highly optimized NVIDIA DeepStream.