@martinsumner

Refactor HTTP GET/PUT API to try and improve efficiency.

Profiling a node handling all HTTP requests with the general_api _perf test, the profile prior to the change looked like this:

FUNCTION                                     CALLS        %    TIME  [uS / CALLS]
--------                                     -----  -------    ----  [----------]
webmachine_resource:do/3                      3396     0.60     825  [      0.24]
string:strip_right/2                         10898     0.62     845  [      0.08]
lists:keyfind/3                              14311     0.63     855  [      0.06]
webmachine_resource:resource_call/3           2384     0.64     875  [      0.37]
gb_trees:lookup_1/2                          16905     0.71     974  [      0.06]
string:tokens_single_2/4                     24313     0.75    1022  [      0.04]
zlib:deflate_nif/4                             103     0.77    1050  [     10.19]
erlang:put/2                                 18841     0.84    1154  [      0.06]
webmachine_dispatcher:try_host_binding1/6     5127     0.88    1201  [      0.23]
lists:reverse/2                              14809     0.88    1206  [      0.08]
gen_statem:loop_receive/3                       86     0.90    1233  [     14.34]
erlang:binary_to_term/1                        340     0.98    1338  [      3.94]
gb_trees:insert_1/4                          11281     0.99    1357  [      0.12]
gen_fsm:loop/8                                 674     1.06    1450  [      2.15]
gen:do_call/4                                  645     1.10    1498  [      2.32]
lists:reverse/1                              21250     1.11    1520  [      0.07]
exometer_probe:loop/1                          768     1.16    1584  [      2.06]
erts_internal:port_command/3                   267     1.17    1603  [      6.00]
erts_internal:port_control/3                  1689     1.18    1618  [      0.96]
mochiweb_http:headers/6                        909     1.82    2492  [      2.74]
ets:lookup/2                                  2582     2.53    3455  [      1.34]
string:to_lower_char/1                      111820     3.42    4671  [      0.04]
gen_server:loop/7                              824     6.65    9086  [     11.03]
mochiweb_http:request/3                        125     7.60   10390  [     83.12]
string:'-to_lower/1-lc$^0/1-0-'/1           121272     9.90   13534  [      0.11]
-----------------------------------------  -------  -------  ------  [----------]
Total:                                     1040513  100.00%  136652  [      0.13]

FUNCTION                                     CALLS        %    TIME  [uS / CALLS]
--------                                     -----  -------    ----  [----------]
lists:keyfind/3                              13819     0.62     811  [      0.06]
string:strip_right/2                         10657     0.63     831  [      0.08]
prim_file:read_nif/2                            87     0.70     921  [     10.59]
zlib:deflate_nif/4                              99     0.73     951  [      9.61]
gb_trees:lookup_1/2                          16853     0.73     956  [      0.06]
string:tokens_single_2/4                     23795     0.76     992  [      0.04]
erlang:put/2                                 18265     0.83    1082  [      0.06]
webmachine_dispatcher:try_host_binding1/6     5002     0.88    1149  [      0.23]
gen_statem:loop_receive/3                       93     0.93    1225  [     13.17]
gb_trees:insert_1/4                          11014     0.97    1269  [      0.12]
lists:reverse/2                              14378     0.97    1270  [      0.09]
exometer_probe:loop/1                          751     1.03    1352  [      1.80]
gen:do_call/4                                  628     1.05    1371  [      2.18]
erlang:binary_to_term/1                        364     1.08    1421  [      3.90]
lists:reverse/1                              20655     1.13    1479  [      0.07]
erts_internal:port_command/3                   244     1.21    1590  [      6.52]
erts_internal:port_control/3                  1683     1.28    1676  [      1.00]
gen_fsm:loop/8                                 605     1.37    1801  [      2.98]
mochiweb_http:headers/6                        919     2.19    2878  [      3.13]
ets:lookup/2                                  2430     2.50    3285  [      1.35]
string:to_lower_char/1                      110384     3.51    4599  [      0.04]
gen_server:loop/7                              776     6.81    8930  [     11.51]
mochiweb_http:request/3                        122     8.51   11156  [     91.44]
string:'-to_lower/1-lc$^0/1-0-'/1           119691    10.15   13315  [      0.11]
-----------------------------------------  -------  -------  ------  [----------]
Total:                                     1007341  100.00%  131147  [      0.13]

After this refactor:

FUNCTION                                    CALLS        %    TIME  [uS / CALLS]
--------                                    -----  -------    ----  [----------]
webmachine_resource:resource_call/3          2317     0.65     821  [      0.35]
lists:keyfind/3                             13846     0.68     866  [      0.06]
webmachine_resource:do/3                     3293     0.79    1005  [      0.31]
erlang:put/2                                18248     0.84    1068  [      0.06]
erlang:binary_to_term/1                       305     0.85    1072  [      3.51]
gb_trees:insert_1/4                         11332     0.85    1081  [      0.10]
zlib:deflate_nif/4                            103     0.89    1131  [     10.98]
exometer_probe:loop/1                         738     0.89    1131  [      1.53]
webmachine_dispatcher:try_host_binding1/6    5002     0.90    1140  [      0.23]
gen_statem:loop_receive/3                      78     0.93    1175  [     15.06]
gen:do_call/4                                 598     0.98    1241  [      2.08]
string:trim_t/3                             10632     1.00    1262  [      0.12]
lists:reverse/2                             15051     1.03    1298  [      0.09]
erts_internal:port_command/3                  244     1.20    1524  [      6.25]
lists:reverse/1                             22539     1.24    1564  [      0.07]
string:lexeme_pick/3                        17830     1.26    1598  [      0.09]
gen_fsm:loop/8                                621     1.27    1610  [      2.59]
erts_internal:port_control/3                 1660     1.29    1627  [      0.98]
lists:member/2                              32688     1.30    1644  [      0.05]
string:search_cp/1                          19848     1.43    1807  [      0.09]
mochiweb_http:headers/6                       898     2.07    2616  [      2.91]
ets:lookup/2                                 2434     2.67    3379  [      1.39]
string:lowercase_list/2                     76913     3.52    4462  [      0.06]
gen_server:loop/7                             755     6.77    8574  [     11.36]
mochiweb_http:request/3                       122     7.57    9587  [     78.58]
-----------------------------------------  ------  -------  ------  [----------]
Total:                                     937636  100.00%  126603  [      0.14]

FUNCTION                                    CALLS        %    TIME  [uS / CALLS]
--------                                    -----  -------    ----  [----------]
webmachine_resource:do/3                     3090     0.70     838  [      0.27]
erlang:put/2                                17130     0.81     977  [      0.06]
gb_trees:insert_1/4                         10819     0.85    1026  [      0.09]
webmachine_dispatcher:try_host_binding1/6    4633     0.88    1057  [      0.23]
exometer_probe:loop/1                         672     0.94    1135  [      1.69]
erlang:binary_to_term/1                       316     0.95    1145  [      3.62]
string:trim_t/3                             10040     0.98    1184  [      0.12]
gen:do_call/4                                 559     0.99    1187  [      2.12]
lists:reverse/2                             14119     1.01    1214  [      0.09]
gen_fsm:loop/8                                542     1.02    1224  [      2.26]
erts_internal:port_command/3                  229     1.08    1303  [      5.69]
erts_internal:port_control/3                 1553     1.20    1442  [      0.93]
lists:reverse/1                             21102     1.22    1469  [      0.07]
string:lexeme_pick/3                        16825     1.26    1521  [      0.09]
mochiweb_http:headers/6                       844     1.30    1563  [      1.85]
lists:member/2                              30839     1.30    1569  [      0.05]
string:search_cp/1                          18720     1.41    1699  [      0.09]
gen_statem:loop_receive/3                      76     1.51    1824  [     24.00]
prim_file:read_nif/2                           75     1.64    1980  [     26.40]
prim_file:pwrite_nif/3                          5     2.01    2427  [    485.40]
ets:lookup/2                                 2229     2.51    3027  [      1.36]
string:lowercase_list/2                     72807     3.40    4090  [      0.06]
gen_server:loop/7                             686     5.91    7118  [     10.38]
mochiweb_http:request/3                       114    11.25   13547  [    118.83]
-----------------------------------------  ------  -------  ------  [----------]
Total:                                     873808  100.00%  120463  [      0.14]

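In theory the refactoring has removed the biggest cause of CPU use in the HTTP API: lower-casing strings. The profile supports this. Before the change, the string:to_lower/1 list comprehension and string:to_lower_char/1 together accounted for roughly 13% of profiled time; after it, string:lowercase_list/2 accounts for about 3.5%.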

The refactoring reduces the number of repeated string lower-casing operations and, across the board, stops using deprecated string functions.

Use updated version of mochiweb/webmachine with deprecated string functions removed.

As part of this, a map of request headers is generated, so that request headers can be handled within the riak_kv_wm_object file without repeated re-normalisation.
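
The idea of normalising headers once into a map can be sketched as below. This is only an illustration, not the code in riak_kv_wm_object: the module name `header_map_sketch`, the function names, and the assumed input shape (a list of `{Name, Value}` pairs) are all hypothetical.

```erlang
-module(header_map_sketch).
-export([normalise/1, get_header/2]).

%% Hypothetical sketch: lower-case each header name exactly once and
%% store the result in a map, so later lookups need no re-normalisation.
-spec normalise([{atom() | binary() | string(), string()}]) ->
    #{string() => string()}.
normalise(Headers) ->
    lists:foldl(
        fun({Name, Value}, Acc) ->
            %% string:lowercase/1 is the non-deprecated replacement
            %% for string:to_lower/1
            Acc#{string:lowercase(to_list(Name)) => Value}
        end,
        #{},
        Headers
    ).

%% Look up a header by its already lower-cased name.
-spec get_header(string(), #{string() => string()}) -> string() | undefined.
get_header(LowerName, HeaderMap) ->
    maps:get(LowerName, HeaderMap, undefined).

%% Header names may arrive as atoms, binaries or strings (an assumption).
to_list(Name) when is_atom(Name) -> atom_to_list(Name);
to_list(Name) when is_binary(Name) -> binary_to_list(Name);
to_list(Name) when is_list(Name) -> Name.
```

With a map built this way, each later check (content type, conditional headers and so on) is a single `maps:get/3` call rather than a fresh lower-casing of the header name.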

Regular expressions are now compiled once rather than re-compiled for every request (a cached version is accessed via persistent_term instead). This should become more efficient again once OTP updates its regex library.
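
A minimal sketch of the caching pattern described above, assuming the compiled pattern is stored under a key derived from the pattern itself. The module name `regex_cache_sketch` and the key layout are hypothetical and not taken from the PR; only `re:compile/1`, `re:run/3` and the `persistent_term` functions are real OTP calls.

```erlang
-module(regex_cache_sketch).
-export([compiled/1, matches/2]).

%% Hypothetical sketch: return a compiled regex, compiling it on first
%% use and caching it in persistent_term for all later requests.
-spec compiled(iodata()) -> term().
compiled(Pattern) ->
    Key = {?MODULE, Pattern},
    case persistent_term:get(Key, undefined) of
        undefined ->
            {ok, MP} = re:compile(Pattern),
            %% persistent_term writes are expensive, reads are cheap:
            %% suitable here because the pattern is written once.
            persistent_term:put(Key, MP),
            MP;
        MP ->
            MP
    end.

%% True if Subject matches Pattern, using the cached compiled form.
-spec matches(iodata(), iodata()) -> boolean().
matches(Subject, Pattern) ->
    re:run(Subject, compiled(Pattern), [{capture, none}]) =:= match.
```

Two processes racing on first use would both compile and store the same pattern, which is harmless but wasteful; populating the cache at application start would avoid that.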

@martinsumner changed the title from "Nhse o34 nhskv.i30 getputapi" to "WIP: Refactor HTTP API" on Feb 14, 2025

martinsumner commented Feb 14, 2025

There is a general question of where to go with this. There are still some obvious potential avenues for further improvements, and questions:

  • The ets:lookup/2 calls (presumably within webmachine) seem surprisingly expensive.
  • Why is the cost per call to mochiweb_http:request/3 so high? When testing the PB API there doesn't seem to be the same overhead (with receiving the request).
  • Should binaries be used instead of strings?
  • Is there a potential gain from replacing gb_trees with maps?
  • What about future needs (e.g. future HTTP versions): will mochiweb be updated?

Perhaps in many cases, if we were to continue refactoring and improving, we would simply be duplicating work already done within Cowboy. The implementation of webmachine is now very similar to cowboy_rest (is Cowboy's implementation a copy of webmachine?), but are there advantages to continuing to use webmachine and mochiweb going forward?

Considering Cowboy is also relevant to the PB API, where there is a desire to potentially remove gen_nb_server; the common approach used on other projects is to adopt ranch instead.

 {riak_pipe, {git, "https://github.com/OpenRiak/riak_pipe.git", {branch, "openriak-3.4"}}},
 {riak_dt, {git, "https://github.com/OpenRiak/riak_dt.git", {branch, "openriak-3.2"}}},
-{riak_api, {git, "https://github.com/OpenRiak/riak_api.git", {branch, "openriak-3.4"}}},
+{riak_api, {git, "https://github.com/OpenRiak/riak_api.git", {branch, "nhse-o34-orkv.i30-string"}}},

Do we want to do this here? I figured that rebar.config would be updated at the end when all dependency branches are ready.

@martinsumner marked this pull request as draft on September 8, 2025 13:14