
feat: benchmark network traffic and database size#837

Merged
brprice merged 8 commits into main from brprice/benchmarks
Feb 27, 2023


@brprice brprice commented Jan 19, 2023

This uses the nixos-test framework to run each component (database, primer, client) in a separate VM, and generates traffic by replaying sessions. We measure the network traffic and the database size.
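As a rough illustration of the "database size" half of the measurement, here is a minimal Python sketch that asks PostgreSQL for the on-disk size of one database via `pg_database_size`. This is not taken from the PR: the helper names and the database name `primer` are assumptions for illustration, and the benchmark may collect the figure differently.

```python
import subprocess


def database_size_sql(dbname: str) -> str:
    """Build a query returning the on-disk size of one database, in bytes."""
    escaped = dbname.replace("'", "''")  # escape quotes for a SQL string literal
    return f"SELECT pg_database_size('{escaped}');"


def parse_size(psql_output: str) -> int:
    """Parse the single integer printed by `psql -At` for the query above."""
    return int(psql_output.strip())


def measure_database_size(dbname: str) -> int:
    """Run the query through psql; needs a reachable PostgreSQL server."""
    result = subprocess.run(
        ["psql", "-At", "-d", dbname, "-c", database_size_sql(dbname)],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_size(result.stdout)
```

Calling `measure_database_size("primer")` (a hypothetical database name) before and after a replay would give the growth caused by that session.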

@brprice brprice force-pushed the brprice/benchmarks branch from 83af348 to 150589b Compare January 23, 2023 11:28
@brprice brprice force-pushed the brprice/benchmarks branch from e73c285 to 6766b6b Compare January 31, 2023 12:27
@brprice brprice force-pushed the brprice/benchmarks branch from 6766b6b to 41b827a Compare February 6, 2023 14:42
@brprice brprice changed the base branch from main to brprice/benchmark-edits February 6, 2023 14:43
@brprice brprice changed the title Brprice/benchmarks feat: benchmark network traffic and database size Feb 6, 2023

brprice commented Feb 6, 2023

This is pretty much done, except:
TODO:

  • document format of fixture
  • document how to extract fixture (I'll do these first two after feat: benchmark edit actions #863 is finished)
  • rebase to remove "TODO" in commit msg and code (note: there is also one in code in a commit which does not have "TODO" in its commit message!)
  • perhaps consider a more realistic fixture

One obvious extension is to support multiple fixtures. This should not be difficult, but I don't know enough about the nix setup to do this quickly. Perhaps leave this for future work?

@dhess: Could you look at the nix code with some suspicion? I think there are at least two bits that could probably be done better: using fromString a lot, and actually hooking up the benchmark into the flake (related to "generalise to have multiple fixtures", since due to my file layout I can't just use importFromDirectory). One higher-level question is where in the flake this should live -- currently I put it directly in benchmarks, gated by an x86_64-linux check, similar to the github-action benchmark.

@brprice brprice force-pushed the brprice/benchmarks branch from 41b827a to cb28cdb Compare February 6, 2023 14:52

dhess commented Feb 6, 2023

I'll take a look, but FYI, I was able to refactor hacknix's NixOS tests so that I could use importFromDirectory by simply moving things around a bit, in case this helps: https://github.com/hackworthltd/hacknix/tree/main/tests

Note that there are subdirs full of test data in there, and so long as you're careful that the importFromDirectory dir only has .nix files with actual tests, it all works.

@brprice brprice force-pushed the brprice/benchmark-edits branch from 15829f3 to 84e5a17 Compare February 8, 2023 14:42

brprice commented Feb 9, 2023

Thanks, I've managed to re-organise so I can use importFromDirectory -- probably best to hold off on looking at this until after I push that. EDIT: done

@brprice brprice force-pushed the brprice/benchmark-edits branch from 9be5b4c to 7558d8b Compare February 14, 2023 15:45
@brprice brprice force-pushed the brprice/benchmark-edits branch 4 times, most recently from 9bf13e3 to f535eac Compare February 22, 2023 16:49
Base automatically changed from brprice/benchmark-edits to main February 23, 2023 11:36
These are added as setup for adding some openapi endpoints to
primer-client.

This is more setup to add openapi endpoints to primer-client.

We add these as setup for a "replay fixture via the API" (as opposed to
our previous "replay edit fixture via 'runEditAppM'").

We now emit `RequestStart` log lines, to aid in extracting
test/benchmark cases.

When looking at the generated logs, note that:
- log messages from multiple simultaneous interactions may be interleaved
- we log some internal high-level requests, so one API request can
  generate many log lines.
- each group is preceded by a `RequestStart` line -- this shows where a
  new request was started, and the following messages will be various
  stages of handling it.
- we will have at most one `Edit` per request, but it will probably be
  after some `ApplyAction*` and `GetProgram` lines, which correspond to
  the actual API request made by your frontend and an internal
  implementation detail.
- technically, there is no guarantee that the `RequestStart` line is
  actually consecutive with its generated logs, but it is normally the
  case for a single-user session.
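
The notes above suggest a simple way to split a single-user log into per-request groups. The following Python sketch assumes each log entry is one text line and that a request boundary can be detected by the substring `RequestStart`; the real log format is not shown here, so both the matching rule and the function name are assumptions. It does not handle interleaved logs from simultaneous sessions.

```python
from typing import Iterable, List


def group_by_request(lines: Iterable[str], marker: str = "RequestStart") -> List[List[str]]:
    """Split log lines into groups, each beginning at a line containing `marker`.

    Lines seen before the first marker are dropped, because they cannot be
    attributed to any request.
    """
    groups: List[List[str]] = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if marker in line:
            current = [line]  # a new request starts here
            groups.append(current)
        elif current is not None:
            current.append(line)  # a later stage of handling the same request
    return groups
```

Each returned group should then contain at most one `Edit` line, typically after some `ApplyAction*` and `GetProgram` lines.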
'primer-replay' is a runner to replay a primer session via the API,
intended for benchmarking network traffic and database size.
Comment thread nixos-bench/fixtures/net-db-replay-1.nix Outdated
@brprice brprice marked this pull request as ready for review February 25, 2023 13:17
@brprice brprice requested a review from a team February 25, 2023 13:17
@brprice brprice force-pushed the brprice/benchmarks branch 2 times, most recently from 13b9c7e to 92dc51a Compare February 26, 2023 13:51
Comment thread flake.nix Outdated
Comment thread nixos-bench/fixtures/net-db-replay-1.nix

@dhess dhess left a comment


Let's move the new benchmark script to the same default.nix file as the others, but otherwise, this is really good stuff, and will be invaluable data. Thanks!

This will be needed for a nixos-test based network traffic benchmark, so
we may as well use it instead of separate arguments for (e.g.) coreutils
and jq.

dhess commented Feb 27, 2023

Here's a thought, though out of scope for this PR: how hard would it be to collect GHC RTS stats from these runs; e.g., number of GCs? Those might be really interesting, as these replays are as close as we can currently get to real-world loads.

@brprice brprice requested a review from dhess February 27, 2023 12:04

brprice commented Feb 27, 2023

Some future work is to add a fourth VM, a proxy. This would add compression and bring this benchmark closer to the real world. Having both uncompressed and compressed stats would be useful, and perhaps both can be collected in one benchmark by measuring on either side of the proxy. The current setup won't do that, as we only get tx/rx stats per interface; giving the proxy two interfaces, one for the client and one for the backend, might work. @dhess has indicated he'll look into this.
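
To make "tx/rx stats per interface" concrete, here is a minimal Python sketch that reads the byte counters Linux exposes under `/sys/class/net/<iface>/statistics/` and computes the traffic between two snapshots. It is not the PR's measurement code: the function names and the interface name `eth1` in the usage note are assumptions, and the benchmark may gather these numbers another way.

```python
from pathlib import Path
from typing import Dict

COUNTERS = ("rx_bytes", "tx_bytes")


def read_counters(iface: str, sysfs_root: Path = Path("/sys/class/net")) -> Dict[str, int]:
    """Read cumulative received/transmitted byte counts for one interface."""
    stats_dir = sysfs_root / iface / "statistics"
    return {name: int((stats_dir / name).read_text().strip()) for name in COUNTERS}


def traffic_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Bytes transferred between two snapshots of the same interface."""
    return {name: after[name] - before[name] for name in before}
```

Taking `read_counters("eth1")` before and after a replay and passing both to `traffic_delta` gives the bytes sent and received on that interface. Because the counters are per interface, a proxy with one interface facing the client and another facing the backend would allow both sides to be measured separately.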

, writeText
, coreutils
, jq
, pkgs

As an aside, I believe this is considered to be bad practice by the nixpkgs maintainers, as you will sometimes want to use a different pkgs for tests etc. than the one your specific tooling (jq, coreutils, etc.) comes from, but I'm not bothered, so this is fine for our needs.

Comment thread flake.nix

@dhess dhess left a comment


The changes are fine, though I would prefer not to have fixup! commits in our history.


brprice commented Feb 27, 2023

> Here's a thought, though out of scope for this PR: how hard would it be to collect GHC RTS stats from these runs; e.g., number of GCs? Those might be really interesting, as these replays are as close as we can currently get to real-world loads.

Actually, I think it may be fairly straightforward. I'll have a look
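
One possible route, sketched here as an assumption rather than as what the follow-up work does: a GHC-compiled program run with `+RTS -t --machine-readable` prints its runtime statistics on exit as a list of string pairs, which happens to be a valid Python literal. The helper below parses such text into a dictionary. The function name is hypothetical, and the exact set of keys depends on the GHC version.

```python
import ast
from typing import Dict


def parse_rts_stats(text: str) -> Dict[str, str]:
    """Parse a `[("key", "value"), ...]` list of string pairs into a dict.

    Anything before the first '[' or after the last ']' is ignored.
    """
    start = text.index("[")
    end = text.rindex("]") + 1
    pairs = ast.literal_eval(text[start:end])
    return dict(pairs)
```

Given captured output, `int(parse_rts_stats(output)["num_GCs"])` would then give the number of garbage collections, assuming that key is present in the output.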

This uses the nixos-test framework, running a primer server, a
postgresql server and a client in separate VMs.

dhess commented Feb 27, 2023

> Some future work is to add a fourth vm, a proxy. This would add compression, and bring this benchmark closer to the real-world. (Having both uncompressed and compressed stats would be useful, though perhaps they can be done in one benchmark, just by measuring either side of the proxy somehow (the current setup won't do that, as we only get tx/rx stats per interface; maybe setting up the proxy to have two interfaces: one for client, one for backend would work)). @dhess has indicated he'll look into this

See #889

@brprice brprice added this pull request to the merge queue Feb 27, 2023
Merged via the queue into main with commit 9022c69 Feb 27, 2023
@brprice brprice deleted the brprice/benchmarks branch February 27, 2023 14:20

brprice commented Feb 27, 2023

> > Here's a thought, though out of scope for this PR: how hard would it be to collect GHC RTS stats from these runs; e.g., number of GCs? Those might be really interesting, as these replays are as close as we can currently get to real-world loads.
>
> Actually, I think it may be fairly straightforward. I'll have a look

See #890
