rxi.github.io/cached_software_rendering.html at master · rxi/rxi.github.io · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Cached Software Rendering | rxi</title>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=0.45">
    <link rel="stylesheet" type="text/css" href="style.css">
  </head>
  <body>
    <a class="logo" href="index.html"></a>
    <h1 class="title">Cached Software Rendering</h1>
<div class="date">2020.04.02</div>
<div class="updated">updated: 2020.04.03</div>
<div class="sections"><div class="section_item"><a href="#introduction">Introduction</a></div><div class="section_item"><a href="#overview">Overview</a></div><div class="section_item"><a href="#command_buffer">Command Buffer</a></div><div class="section_item"><a href="#hash_grid">Hash Grid</a></div><div class="section_item"><a href="#rendering">Rendering</a></div><div class="section_item"><a href="#conclusion">Conclusion</a></div></div>

<h2 id="introduction" class="section_header">Introduction</h2>
<p>
  This write-up details an approach to 2d software rendering useful for typical
  application UIs where between most frames little-or-none of the screen&#39;s
  content changes. This approach is used in the
  <a class="link" href="https://github.com/rxi/lite">lite</a> text editor.
</p>

<p>
  The approach provides the performance advantages of dirty rectangles but
  abstracts any of the mental overhead or code complexity usually associated
  with them — the application&#39;s code can act as if it is redrawing everything
  each frame while the renderer takes care of only redrawing regions of the
  screen which have changed. This leads to <em>simpler</em>, less error prone
  application code without the waste of CPU time which would result from
  continually redrawing.
</p>

<h2 id="overview" class="section_header">Overview</h2>
<p>
  The approach exists in 3 parts: the <code>Command Buffer</code>, <code>Hash Grid</code> and
  <code>Renderer</code>. For each frame where the application would typically draw, it
  instead pushes &quot;draw commands&quot; to a command buffer; at the end of the frame the
  command buffer is iterated and the commands themselves are added to the hash
  grid. Finally each cell of the hash grid is compared with the previous frame&#39;s
  and the regions for the cells which have changed are redrawn.
</p>

<h2 id="command_buffer" class="section_header">Command Buffer</h2>
<p>
  The command buffer is a simple byte array which each draw command is pushed to:
</p>
<pre class="codeblock">
char command_buf[1024*1024];
int  command_buf_idx;
</pre>

<p>
  A draw command must contain a rectangle representing the area of the screen
  which the draw operation would effect when performed (this is used later by the
  hash grid), as well as any additional information required to draw the item (eg.
  an image pointer, color, or string for text rendering). A base command at
  minimum might be represented by the following struct:
</p>
<pre class="codeblock">
typedef struct { int x, y, w, h; } Rect;

typedef struct {
  int type, size;
  Rect rect;
} Command;
</pre>


<p>
  This would be used as a base for any other commands which would extend it with
  additional information they require:
</p>
<pre class="codeblock">
typedef struct { unsigned char r, g, b, a; } Color;

typedef struct {
  Command cmd;
  Color color;
} RectCommand;
</pre>

<p>
  Whenever we call the <code>push_rectangle</code> function we would use the arguments
  passed it to fill the <code>RectCommand</code> struct and push it to the renderer&#39;s
  command buffer; no actual drawing would be done at this point:
</p>

<pre class="codeblock">
void push_rectangle(Rect r, Color clr) {
  RectCommand *c = (void*) &amp;command_buf[command_buf_idx];
  c-&gt;cmd.type = RECT_COMMAND;
  c-&gt;cmd.size = sizeof(*c);
  c-&gt;cmd.rect = r;
  c-&gt;color = clr;
  command_buf_idx += c-&gt;cmd.size;
}
</pre>

<p>
  We would do the same for any other draw commands which occur in the frame.
</p>

<h2 id="hash_grid" class="section_header">Hash Grid</h2>
<p>
  The <code>Hash Grid</code> is a 2d grid of cells where each cell is represented by a hash
  value <em>(a 32bit unsigned integer and simple hash function such as FNV-1a should
  suffice)</em> which is associated with an NxN pixel region of the screen. There must
  be enough cells to account for the whole screen, eg. a screen resolution of
  1920x1080 and cell size of 100x100 pixels would require a 20x11 grid of cells. A
  second hash grid of the same size is also stored as we&#39;ll need to compare the
  current hash grid with the one of the previous frame.
</p>
<pre class="codeblock">
unsigned  cell_buf1[CELLS_X * CELLS_Y];
unsigned  cell_buf2[CELLS_X * CELLS_Y];
unsigned *cells = cell_buf1;
unsigned *cells_prev = cell_buf2;
</pre>

<p>
  At the end of each frame when the command buffer has been filled by the
  application&#39;s code, we reset the hash grid to an initial state; that is,
  setting each cell&#39;s hash value to the initial value used by our hash function.
</p>
<pre class="codeblock">
for (int i = 0; i &lt; CELLS_X * CELLS_Y; i++) {
  cells[i] = HASH_INITIAL;
}
</pre>

<p>
  We then iterate the buffer of commands and for each one we hash the command
  itself. Each hash value for the cells of the hash grid which the command&#39;s
  <code>rect</code> overlaps would then be updated with the command&#39;s hash value:
</p>
<pre class="codeblock">
Command *cmd = (void*)  command_buf;
Command *end = (void*) &amp;command_buf[command_buf_idx];

while (cmd != end) {
  unsigned h = HASH_INITIAL;
  hash(&amp;h, cmd, cmd-&gt;size);
  update_overlapping_cells(cmd-&gt;rect, h);
  cmd = (void*) (((char*) cmd) + cmd-&gt;size);
}
</pre>

<p>
  With the implementation of <code>update_overlapping_cells</code>:
</p>
<pre class="codeblock">
void update_overlapping_cells(Rect r, unsigned h) {
  int x1 = r.x / CELL_SIZE;
  int y1 = r.y / CELL_SIZE;
  int x2 = (r.x + r.w) / CELL_SIZE;
  int y2 = (r.y + r.h) / CELL_SIZE;
  for (int y = y1; y &lt;= y2; y++) {
    for (int x = x1; x &lt;= x2; x++) {
      hash(&amp;cells[x + y * CELLS_X], &amp;h, sizeof(h));
    }
  }
}
</pre>

<h2 id="rendering" class="section_header">Rendering</h2>
<p>
  Once all the commands have been iterated the hash grid is complete. We then
  iterate each cell of the hash grid and compare it to the previous frame&#39;s value,
  if the cells differ we need to redraw the region of the screen that the cell
  represents.
</p>

<p>
  For each changed region we set our renderer to clip to that region and iterate
  the draw commands again, for every command where the rect overlaps that cell
  we perform the command&#39;s draw operation:
</p>
<pre class="codeblock">
for (int y = 0; y &lt; CELLS_Y; y++) {
  for (int x = 0; x &lt; CELLS_X; x++) {
    int idx = x + y * CELLS_X;
    if (cells[idx] != cells_prev[idx]) {
      Rect rect = { x * CELL_SIZE, y * CELL_SIZE, CELL_SIZE, CELL_SIZE };
      redraw_region(rect);
    }
  }
}
</pre>

<p>
  As a simple optimisation each changed cell shouldn&#39;t draw immediately, but
  instead an effort should be made to merge the rectangle regions for adjacent
  cells into larger rectangle regions which are then redrawn.
</p>

<p>
  Once we have finished redrawing the changed regions we swap the current frame&#39;s
  hash grid with the previous frame&#39;s (such that next frame we would be comparing
  with this frame&#39;s state) and reset the command buffer index to the start of the
  buffer:
</p>
<pre class="codeblock">
unsigned *tmp = cells_prev;
cells_prev = cells;
cells = tmp;
command_buf_idx = 0;
</pre>

<p>
  Although the approach is aimed towards software rendering, it is not tied to it
  specifically and can be used to minimize redrawing when using hardware
  rendering; though it&#39;s questionable in reality how many situations would be
  improved by the approach when using hardware rendering.
</p>

<h2 id="conclusion" class="section_header">Conclusion</h2>
<p>
  The result of the approach means that rendering each frame is reduced to
  pushing a series of commands to a linear buffer and performing a fast hash
  function. For a typical UI most frames will result in nothing being redrawn,
  and most redraws will do so to only a small portion of the screen. Consider
  the difference in redrawing a transparent 512x512 rectangle which would
  require blending 262,144 pixels (1,048,576 bytes), versus hashing 180 bytes of
  data (the 40 byte command and 36 hash grid cells; <em>note: the number of
  cells can be reduced by increasing the cell size if you&#39;re typically drawing
  larger elements</em>). The more costly the drawing operation would have been the
  more benefit the approach yields.
</p>


  </body>
</html>