Skip to content

Do not re-run components unnecessarily#469

Closed
ANogin wants to merge 15 commits into
redballoonsecurity:masterfrom
ANogin:bugfix/no_identifier_rerun
Closed

Do not re-run components unnecessarily#469
ANogin wants to merge 15 commits into
redballoonsecurity:masterfrom
ANogin:bugfix/no_identifier_rerun

Conversation

@ANogin
Copy link
Copy Markdown
Contributor

@ANogin ANogin commented Jun 1, 2024

One sentence summary of this PR (This should go in the CHANGELOG!)
Avoid rerunning components unnecessarily; eliminate an already unlikely possibility that a component that should be rerun (because of difference in configuration) will not be rerun.

Link to Related Issue(s)
N/A

Please describe the changes in your request.
Previously, when executing a set of components (e.g. identifiers) on a resource, the job service will check whether the tags that the previous run has added means that more components should be executed, and then will run the full set again. This changes it so that only the additional components are executed, but those that were already executed once are not executed again. However if a component is run with a non-null config, it is always run, without regard for caching.

I observed speedups of up to 10% or more on some messy identify/unpack code.

Anyone you think should look at this, specifically?
@whyitfor

Aleksey Nogin added 9 commits May 31, 2024 23:35
One way to look at the issue is that the task cache was emptied too eagerly.
This changes it so that the task cache is only emptied when the job
service is idle.
Without this fix the custom config could end up being ignored - this bug
existed before, but the earlier changes here made it way easier to hit
it.
Rather than manually querying the resource for whether the component was
run (which would not return a hit when the component is still running),
hook directly into Resource.remove_component to remove stale cache
entries
This is needed because image_seq changes
@ANogin ANogin changed the title Do not re-run identifiers when new candidates are added to the list Do not re-run components unnecessarily Jun 1, 2024
Aleksey Nogin added 4 commits June 1, 2024 08:26
@ANogin
Copy link
Copy Markdown
Contributor Author

ANogin commented Jun 2, 2024

I did a timing test of a gather of 2x (create_root_resource_from_file + unpack_recursively(do_not_unpack=[Elf]) + auto_run_recursively(all_identifiers=True)) on TinyCore iso 14 and 15 respectively. Here are the results

bugfix/no_identifier_rerun    98.44s wall   65.83s CPU   142.9ms async select   32.47s blocked
bugfix/no_identifier_rerun    97.17s wall   64.74s CPU   146.2ms async select   32.29s blocked
bugfix/no_identifier_rerun   101.92s wall   68.11s CPU   151.5ms async select   33.66s blocked
bugfix/no_identifier_rerun    98.60s wall   66.23s CPU   148.5ms async select   32.22s blocked
bugfix/no_identifier_rerun    99.96s wall   66.26s CPU   134.2ms async select   33.56s blocked

master f3e583c               108.79s wall   76.34s CPU   146.6ms async select   32.31s blocked
master f3e583c               111.36s wall   77.90s CPU   151.1ms async select   33.31s blocked
master f3e583c               109.05s wall   76.93s CPU   146.9ms async select   31.98s blocked
master f3e583c               113.03s wall   78.98s CPU   144.8ms async select   33.91s blocked
master f3e583c               112.77s wall   79.39s CPU   130.9ms async select   33.25s blocked

where I used the approach from https://gist.github.com/vxgmichel/620eb3a02d97d3da9dacdc508a5d5321 to break the non-CPU time into “select” vs “blocked IO”.

@whyitfor whyitfor self-requested a review June 6, 2024 02:39
Copy link
Copy Markdown
Contributor

@whyitfor whyitfor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ANogin:

  1. Can you fix merge conflicts given recent pushes to master
  2. With #472 now on main, is the performance speedup still 10%?

@ANogin
Copy link
Copy Markdown
Contributor Author

ANogin commented Jun 6, 2024

  1. Can you fix merge conflicts given recent pushes to master

Done

  1. With Use GzipFile python unpacker for speed, fall back on pigz if needed #472 now on main, is the performance speedup still 10%?

No, it's ~35-40% now.

master                       27.35s wall   26.99s CPU   124.0ms async select   244.7ms blocked
master                       26.97s wall   26.55s CPU   125.3ms async select   293.7ms blocked
master                       26.90s wall   26.52s CPU   120.5ms async select   265.8ms blocked
master                       27.68s wall   27.29s CPU   123.3ms async select   267.2ms blocked
master                       26.91s wall   26.51s CPU   131.1ms async select   267.9ms blocked

bugfix/no_identifier_rerun   19.84s wall   19.44s CPU   119.7ms async select   280.6ms blocked
bugfix/no_identifier_rerun   19.44s wall   18.99s CPU   125.5ms async select   327.8ms blocked
bugfix/no_identifier_rerun   19.43s wall   18.99s CPU   124.6ms async select   313.0ms blocked
bugfix/no_identifier_rerun   19.85s wall   19.41s CPU   125.1ms async select   317.7ms blocked
bugfix/no_identifier_rerun   19.66s wall   19.19s CPU   132.9ms async select   338.6ms blocked

@ANogin
Copy link
Copy Markdown
Contributor Author

ANogin commented Aug 21, 2024

@whyitfor convinced me this is not the way to go.

@ANogin ANogin closed this Aug 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants