
Optimize command formatting and representation#162

Merged
zuiderkwast merged 3 commits into Ericsson:main from
zuiderkwast:bin-append-command-formatting
Apr 1, 2026

Conversation

@zuiderkwast
Collaborator

@zuiderkwast zuiderkwast commented Mar 19, 2026

This PR optimizes how commands are formatted and represented internally, reducing overhead in the hot path (command formatting → cross-process copy → TCP send).

Changes:

  1. Use binary append instead of iolist_to_binary in command_to_bin
  2. Flatten pipeline to single binary with precomputed response class

Build the RESP-encoded command pipeline (batch) as a single binary by appending directly (<<Acc/binary, ...>>) instead of building one binary per command in the pipeline. Also, don't build it via an intermediate iolist converted with iolist_to_binary. This makes cross-process copies cheaper (one refc binary instead of a list of N binaries + N cons cells).

Precompute response classes (information about each command's expected replies) at convert_to time and store them in the command representation tuple.

When an ASK redirect is received, split the pipeline RESP binary into individual commands so we can interleave ASKING commands. ASK redirects are expected to be rare so it's fine that this is not optimal.
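The split on an ASK redirect amounts to walking the RESP framing: each command starts with a `*<nargs>\r\n` array header followed by `<nargs>` bulk strings `$<len>\r\n<data>\r\n`. A minimal illustrative sketch (not the actual ered implementation; module and function names are hypothetical) of recovering the per-command binaries:

```erlang
%% Hypothetical sketch: split a concatenated RESP pipeline binary back
%% into one binary per command by parsing the array/bulk-string framing.
-module(resp_split).
-export([split/1]).

split(<<>>) -> [];
split(Bin) ->
    {Cmd, Rest} = split_one(Bin),
    [Cmd | split(Rest)].

%% Consume one "*<n>\r\n" array of "$<len>\r\n<data>\r\n" bulk strings.
split_one(Bin) ->
    {NArgs, Rest} = read_int($*, Bin),
    skip_args(NArgs, Rest, Bin).

skip_args(0, Rest, Orig) ->
    Taken = byte_size(Orig) - byte_size(Rest),
    <<Cmd:Taken/binary, _/binary>> = Orig,
    {Cmd, Rest};
skip_args(N, Bin, Orig) ->
    {Len, Rest0} = read_int($$, Bin),
    <<_:Len/binary, "\r\n", Rest/binary>> = Rest0,
    skip_args(N - 1, Rest, Orig).

read_int(Marker, <<Marker, Rest/binary>>) ->
    read_digits(Rest, 0).

read_digits(<<"\r\n", Rest/binary>>, Acc) -> {Acc, Rest};
read_digits(<<D, Rest/binary>>, Acc) when D >= $0, D =< $9 ->
    read_digits(Rest, Acc * 10 + D - $0).
```

Since the sub-binaries reference the original refc binary, the split itself allocates no new command data, only match contexts.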

Changed type redis_command():

{redis_command, {single, response_class()}, binary()}
{redis_command, {pipeline, [response_class()]}, binary()}
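With the class stored in the tuple, the accessors reduce to pattern matches on the record-like tuple. An illustrative sketch (the actual ered accessor names and response-class values may differ; `normal` below is a made-up class atom):

```erlang
%% Hypothetical sketch of accessors over the new representation:
%% both are simple field accesses, no re-parsing of the binary.
-module(cmd_repr).
-export([get_data/1, get_response_class/1]).

%% The RESP-encoded payload is always the third element.
get_data({redis_command, _Classes, Data}) -> Data.

%% The precomputed class(es) are in the second element.
get_response_class({redis_command, {single, Class}, _Data}) -> Class;
get_response_class({redis_command, {pipeline, Classes}, _Data}) -> Classes.
```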

Benchmark results (convert_to + cross-process copy, ns/op, OTP 28):

  Command               main   change 1: bin-append   change 2: flatten pipeline
  single small (37B)     750   723 (-4%)              786 (+5%)
  pipeline 10 (1.1KB)   3076   2815 (-8%)             1899 (-38%)
  large real 7×5KB      3469   3740 (+8%)             2188 (-37%)

Change 1 (bin-append) is roughly neutral — small win on single/pipeline, small loss on large.

Change 2 (flatten pipeline) is where the big gains are: ~38% faster for pipelines. The single binary is much cheaper to copy between processes than a list of binaries.

Build the RESP binary by appending directly instead of constructing
an iolist and converting it. This avoids the intermediate iolist
allocation and is faster for the full data path (cross-process copy
+ TCP send) which is the common case.

Benchmark results (cross-process copy + TCP send, OTP 28):

                    iolist_to_binary  iolist    bin-append
  small (37B)       2105 ns/op        1128      888
  pipeline (1.1KB)  2492 ns/op        2930     1226
Change redis_command() representation from a list of per-command
binaries to a single concatenated binary. Response classes are
precomputed at convert_to time and stored in the tuple, making
get_data and get_response_class simple field accesses.

New type:
  {redis_command, {single, response_class()}, binary()}
  {redis_command, {pipeline, [response_class()]}, binary()}

Also simplify resp_class to work on raw commands instead of
RESP-formatted binaries, and preserve the original response
class in add_asking (needed for sharded pub/sub commands like
SSUBSCRIBE that can receive ASK redirects).

Benchmark results (convert_to + cross-process copy + TCP send, OTP 28):

                    before        after
  single small      2053 ns/op    2372
  pipeline 10       3584 ns/op    2712
  large real        3823 ns/op    3050
@zuiderkwast zuiderkwast marked this pull request as ready for review March 20, 2026 13:01
@zuiderkwast
Collaborator Author

The failing tests are flaky; most of them will be fixed in the feature branch for the single-process client.

@WilliamVoong @bjosv PTAL

@zuiderkwast zuiderkwast requested a review from bjosv March 20, 2026 13:12
@zuiderkwast
Collaborator Author

zuiderkwast commented Mar 20, 2026

The benchmarks were run using some code directly on the command line, like this:

rebar3 shell --eval '
Small = [<<"SET">>, <<"mykey">>, <<"myvalue">>],
Medium = [[<<"HSET">>, <<"hash:key">>, <<"field">>, crypto:strong_rand_bytes(64)] || _ <- lists:seq(1, 10)],
Large = amf_valkey_data:commands(),
N = 1000000,
Bench = fun(Label, Nn, Cmd) ->
    Recv = spawn_link(fun() -> (fun F(0) -> receive {done,P} -> P ! ok end; F(I) -> receive {cmd,_} -> F(I-1) end end)(Nn) end),
    T0 = erlang:monotonic_time(microsecond),
    lists:foreach(fun(_) -> C = ered_command:convert_to(Cmd), Recv ! {cmd, C} end, lists:seq(1,Nn)),
    Recv ! {done, self()}, receive ok -> ok end,
    T1 = erlang:monotonic_time(microsecond),
    io:format("~-20s ~.1f~n", [Label, (T1-T0)*1000/Nn])
end,
io:format("=== commit 2: flatten pipeline (convert_to + copy) ===~n"),
Bench("single small", N, Small),
Bench("pipeline 10", N div 10, Medium),
Bench("large real 7x5KB", N div 50, Large),
halt().
' 2>&1 | grep -E "===|small|pipeline|large"

I also have a temporary module that does more or less the same:

src/ered_command_bench.erl
-module(ered_command_bench).
-export([run/0]).

run() ->
    Small = [<<"SET">>, <<"mykey">>, <<"myvalue">>],
    Medium = [[<<"HSET">>, <<"hash:key">>, <<"field">>, crypto:strong_rand_bytes(64)]
              || _ <- lists:seq(1, 10)],
    Large = try amf_valkey_data:commands()
            catch _:_ -> [[<<"MSET">> | [crypto:strong_rand_bytes(128) || _ <- lists:seq(1, 100)]]]
            end,
    BigVal = [<<"SET">>, <<"mykey">>, crypto:strong_rand_bytes(100000)],

    N = 500000,

    io:format("=== convert_to only ===~n"),
    bench_convert("single small", N, Small),
    bench_convert("pipeline 10", N div 10, Medium),
    bench_convert("large real", N div 50, Large),
    bench_convert("big value 100KB", N div 50, BigVal),

    io:format("~n=== convert_to: iolist_to_binary vs iolist vs bin-append ===~n"),
    bench_convert_variants("single small", N, Small),
    bench_convert_variants("pipeline 10", N div 10, Medium),
    bench_convert_variants("large real", N div 50, Large),
    bench_convert_variants("big value 100KB", N div 50, BigVal),

    io:format("~n=== convert_to + cross-process copy ===~n"),
    bench_copy("single small", N, Small),
    bench_copy("pipeline 10", N div 10, Medium),
    bench_copy("large real", N div 50, Large),
    bench_copy("big value 100KB", N div 50, BigVal),

    io:format("~n=== convert_to + cross-process copy + port send ===~n"),
    {ok, LSock} = gen_tcp:listen(0, [{active, false}, binary]),
    {ok, Port} = inet:port(LSock),
    spawn_link(fun() -> {ok, S} = gen_tcp:accept(LSock), drain(S) end),
    {ok, Sock} = gen_tcp:connect("localhost", Port, [binary, {active, false}]),
    bench_send("single small", N, Small, Sock),
    bench_send("pipeline 10", N div 10, Medium, Sock),
    bench_send("large real", N div 50, Large, Sock),
    bench_send("big value 100KB", N div 50, BigVal, Sock),
    gen_tcp:close(Sock),
    gen_tcp:close(LSock),

    io:format("~n=== binary vs iolist vs bin-append: cross-process copy + send ===~n"),
    bench_three_variants(N, Small, "small (37B)"),
    MedBin = ered_command:get_data(ered_command:convert_to(Medium)),
    MedSize = iolist_size(MedBin),
    bench_three_variants(N div 10, Medium, io_lib:format("pipeline 10 (~pB)", [MedSize])),
    ok.

%% --- convert_to only ---
bench_convert(Label, N, Cmd) ->
    _ = [ered_command:convert_to(Cmd) || _ <- lists:seq(1, 100)],
    T0 = erlang:monotonic_time(microsecond),
    repeat(fun() -> ered_command:convert_to(Cmd) end, N),
    T1 = erlang:monotonic_time(microsecond),
    io:format("~-20s ~.1f ns/op~n", [Label, (T1 - T0) * 1000 / N]).

bench_convert_variants(Label, N, Cmd) ->
    T0 = erlang:monotonic_time(microsecond),
    repeat(fun() -> ered_command:convert_to(Cmd) end, N),
    T1 = erlang:monotonic_time(microsecond),
    repeat(fun() -> make_iolist(Cmd) end, N),
    T2 = erlang:monotonic_time(microsecond),
    repeat(fun() -> make_bin_append(Cmd) end, N),
    T3 = erlang:monotonic_time(microsecond),
    io:format("~-20s iolist_to_binary: ~.1f  iolist: ~.1f  bin-append: ~.1f ns/op~n",
              [Label, (T1 - T0) * 1000 / N, (T2 - T1) * 1000 / N, (T3 - T2) * 1000 / N]).

%% --- convert_to + send as message to another process ---
bench_copy(Label, N, Cmd) ->
    Receiver = spawn_link(fun() -> sink(N) end),
    T0 = erlang:monotonic_time(microsecond),
    repeat(fun() ->
                   Converted = ered_command:convert_to(Cmd),
                   Receiver ! {cmd, Converted}
           end, N),
    Receiver ! {done, self()},
    receive ok -> ok end,
    T1 = erlang:monotonic_time(microsecond),
    io:format("~-20s ~.1f ns/op~n", [Label, (T1 - T0) * 1000 / N]).

%% --- convert_to + send message + tcp send (mimics full path) ---
bench_send(Label, N, Cmd, Sock) ->
    Sender = spawn_link(fun() -> tcp_sender(N, Sock) end),
    T0 = erlang:monotonic_time(microsecond),
    repeat(fun() ->
                   Converted = ered_command:convert_to(Cmd),
                   Data = ered_command:get_data(Converted),
                   Sender ! {send, Data}
           end, N),
    Sender ! {done, self()},
    receive ok -> ok end,
    T1 = erlang:monotonic_time(microsecond),
    io:format("~-20s ~.1f ns/op~n", [Label, (T1 - T0) * 1000 / N]).

%% --- Compare binary vs iolist vs bin-append for cross-process copy + tcp send ---
bench_three_variants(N, Cmd, Label) ->
    BinData = ered_command:get_data(ered_command:convert_to(Cmd)),
    IoListData = make_iolist(Cmd),
    BinAppendData = make_bin_append(Cmd),

    {ok, LSock} = gen_tcp:listen(0, [{active, false}, binary]),
    {ok, Port} = inet:port(LSock),
    Socks = [begin
                 spawn_link(fun() -> {ok, S} = gen_tcp:accept(LSock), drain(S) end),
                 {ok, S} = gen_tcp:connect("localhost", Port, [binary, {active, false}]),
                 S
             end || _ <- [1,2,3]],
    [Sock1, Sock2, Sock3] = Socks,

    T0 = bench_send_data(N, BinData, Sock1),
    T1 = bench_send_data(N, IoListData, Sock2),
    T2 = bench_send_data(N, BinAppendData, Sock3),

    [gen_tcp:close(S) || S <- Socks],
    gen_tcp:close(LSock),

    io:format("~s~n  iolist_to_binary: ~.1f ns/op~n  iolist:           ~.1f ns/op~n  bin-append:       ~.1f ns/op~n",
              [Label, T0 * 1000 / N, T1 * 1000 / N, T2 * 1000 / N]).

bench_send_data(N, Data, Sock) ->
    Sender = spawn_link(fun() -> tcp_sender(N, Sock) end),
    T0 = erlang:monotonic_time(microsecond),
    repeat(fun() -> Sender ! {send, Data} end, N),
    Sender ! {done, self()},
    receive ok -> ok end,
    T1 = erlang:monotonic_time(microsecond),
    T1 - T0.

%% Build the iolist that command_to_bin would produce before iolist_to_binary
make_iolist([E|_] = Cmds) when is_list(E) ->
    [make_iolist_single(C) || C <- Cmds];
make_iolist(Cmd) ->
    make_iolist_single(Cmd).

make_iolist_single(RawCmd) ->
    Len = integer_to_list(length(RawCmd)),
    Elements = [[$$, integer_to_list(size(Bin)), $\r, $\n, Bin, $\r, $\n] || Bin <- RawCmd],
    [$*, Len, $\r, $\n, Elements].

%% Build binary by appending (no iolist intermediate)
make_bin_append([E|_] = Cmds) when is_list(E) ->
    [make_bin_append_single(C) || C <- Cmds];
make_bin_append(Cmd) ->
    make_bin_append_single(Cmd).

make_bin_append_single(RawCmd) ->
    Len = integer_to_binary(length(RawCmd)),
    Header = <<"*", Len/binary, "\r\n">>,
    lists:foldl(fun(Bin, Acc) ->
                        Size = integer_to_binary(byte_size(Bin)),
                        <<Acc/binary, $$, Size/binary, "\r\n", Bin/binary, "\r\n">>
                end, Header, RawCmd).

%% Helpers
sink(0) ->
    receive {done, Pid} -> Pid ! ok end;
sink(N) ->
    receive {cmd, _} -> sink(N - 1) end.

tcp_sender(0, _Sock) ->
    receive {done, Pid} -> Pid ! ok end;
tcp_sender(N, Sock) ->
    receive {send, Data} ->
            gen_tcp:send(Sock, Data),
            tcp_sender(N - 1, Sock)
    end.

drain(Sock) ->
    case gen_tcp:recv(Sock, 0, 100) of
        {ok, _} -> drain(Sock);
        {error, _} -> ok
    end.

repeat(Fun, N) when N > 0 -> Fun(), repeat(Fun, N - 1);
repeat(_, 0) -> ok.

@WilliamVoong
Collaborator

WilliamVoong commented Mar 20, 2026

I took a quick look and overall this looks great, very clever to convert the list of commands to a binary for cheaper copying between processes.

I will review it in detail next week.

@zuiderkwast
Collaborator Author

very clever to convert the list of commands to a binary for cheaper copying between processes.

It was your idea to do this. 😄 (although I took it further)

Collaborator

@WilliamVoong WilliamVoong left a comment


I have reviewed the code and it looks good to me. I appreciate the good benchmarks you provided along with the PR.

I have added one comment to the review. Please take a look.

@zuiderkwast zuiderkwast merged commit 8ea5293 into Ericsson:main Apr 1, 2026
9 of 11 checks passed
