Skip to content

Speed up on#48

Merged
gipert merged 17 commits into
gipert:mainfrom
iguinn:speed
May 6, 2026
Merged

Speed up on#48
gipert merged 17 commits into
gipert:mainfrom
iguinn:speed

Conversation

@iguinn
Copy link
Copy Markdown
Contributor

@iguinn iguinn commented Feb 16, 2026

Changes to TextDB and Props to speed up repeated calls to on. After these changes, I was able to get it so that when calling on for each cycle, the longest time is taken by yaml in reading the data.

Note these break some features (that I'm not sure are important). If these features are important, then an alternative to avoid processing time associated with calling on frequently when no new data is being read is in this PR: #46

  1. Do not use deepcopy in Props.add_to. This shouldn't have unintended side-effects because the output nested dict structure is being constructed anew rather than copying from the individual dicts; unintended consequences could only arise if one were to change one of the values inside of the dict.
  2. on is using rglob to find paths; this replaces it with just checking if the path is valid. This means that wildcards will no longer work, and paths will not be found nested in other paths.
  3. Do not call resolve on relative paths in TextDB.getitem. I think this could break using environment variables as a part of a relative path.
  4. Do not call subst_vars in on. Subst_vars is called when initially reading in the AttrsDicts, and so calling it again should be redundant.

@iguinn iguinn mentioned this pull request Feb 16, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.33%. Comparing base (5b17622) to head (dddf6ad).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #48      +/-   ##
==========================================
+ Coverage   91.72%   92.33%   +0.60%     
==========================================
  Files           6        6              
  Lines         568      587      +19     
==========================================
+ Hits          521      542      +21     
+ Misses         47       45       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@gipert gipert self-requested a review March 11, 2026 08:54
Copy link
Copy Markdown
Owner

@gipert gipert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could merge this, I don't think it would break anything.

But it's important to document these limitations in the docstring. Could you do that please?

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented Apr 28, 2026

I've updated the docstrings with limitations. I didn't add a note about the rglob one, since I don't think the definition here https://legend-exp.github.io/legend-data-format-specs/dev/metadata/#Specifying-metadata-validity-in-time-(and-system) implies an expectation that wildcards are allowed. If this PR is accepted, #46 can be closed

@ggmarshall
Copy link
Copy Markdown
Contributor

Removing the deep copies worries me, I have a recollection of this being needed

@ggmarshall
Copy link
Copy Markdown
Contributor

From memory without the deepcopy repeated calls to the database were not giving the expected results (not sure how well this is covered by our unit tests)

@ggmarshall
Copy link
Copy Markdown
Contributor

ggmarshall commented Apr 28, 2026

Ok I found the commit : legend-exp/pylegendmeta@2e6e900 , and the reason was when calculating exposures we found the psd_statuses as they were nested were being changed by the dictionaries being mutable, here was our test case (old code now)

from legendmeta import LegendMetadata

lmeta = LegendMetadata("./legend-metadata")
chmap = lmeta.channelmap()

exposure = 0
exposure_psd = 0

for period, runs in lmeta.dataprod.config.analysis_runs.items():
    for run in runs:
        if "phy" not in lmeta.dataprod.runinfo[period][run]:
            continue

        channels = lmeta.channelmap(on=lmeta.dataprod.runinfo[period][run].phy.start_key)
        for hpge, info in channels.group("system").geds.map("name").items():

            if info.analysis.usability == "on":
                this_exposure = (
                    lmeta.dataprod.runinfo[period][run].phy.livetime_in_s / (60 * 60 * 24 * 365)
                    * info.production.mass_in_g / 1000
                )
                exposure += this_exposure

            pinfo = info.analysis.psd
            if "is_bb_like" in pinfo and pinfo.is_bb_like != "missing":
                if all([pinfo.status[psd.strip()] == "valid" for psd in pinfo.is_bb_like.split("&")]):
                    exposure_psd += this_exposure


print(exposure, "kg yr")
print(exposure_psd, "kg yr")

@gipert
Copy link
Copy Markdown
Owner

gipert commented Apr 29, 2026

yes let's try to reconstruct this problem, i also remember it now. we should try to cover it in the unit tests

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented Apr 29, 2026

Deep copying has a huge impact on performance when you are repeatedly reading the same thing, this has an order of magnitude effect:
image

The above was calling channelmap and on, timing the first read (that's the first time), timeit-ing repeated reads with the same timestamp (which is equivalent to looping over cycles and using each cycle's timestamp), and then profiling. The timeit is highlighted because this is the issue being solved with this. The profile shows that the deepcopy is indeed responsible for the slow down.

@ggmarshall
Copy link
Copy Markdown
Contributor

I agree it would be good to remove it but need a way to make sure that we are not going to run in mutable dictionary issues again

@ggmarshall
Copy link
Copy Markdown
Contributor

Im also surprised hit pars is so much faster than channelmap despite it being a larger dictionary with a more complex structure? We should look into why

@ggmarshall
Copy link
Copy Markdown
Contributor

Claude suggests we use a lightweight copy:

@staticmethod
def _copy(v):
    if isinstance(v, dict):
        return type(v)({k: Props._copy(vv) for k, vv in v.items()})
    if isinstance(v, list):
        return [Props._copy(vv) for vv in v]
    return v  # str, int, float, bool, None — immutable

Props._copy(b[key])

@ggmarshall
Copy link
Copy Markdown
Contributor

ggmarshall commented Apr 29, 2026

Actually I like your solution more

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented Apr 29, 2026

Added a test to catch this problem.

So, I tried an equivalent solution before the one I just pushed, and it helps but not much.

A solution that works well is to minimize the number of copies. The logic is to do a shallow-copy of a whole branch if nothing in the branch is getting updated. If something new is added, we also can do a shallow-copy. If we are doing an update of a branch, then we will create a new dict to avoid update side-effects, and then repeat the previous logic to minimize how deep we have to recurse and shallow-copy whenever that is acceptable. This performs quite well, and passes the test:
image

Note that I think this will create some potentially unpredictable/confusing results if you try modifying the result of on. Basically, it will update whatever underlying dict was the source of the information, unless it recursed all the way to the value itself, in which case it won't replace even in the underlying dict. I don't think we expect people to modify the results of on, but it's possible we want a way to enforce that on should be read-only somehow (would be much easier in C++ with const...).

@gipert
Copy link
Copy Markdown
Owner

gipert commented Apr 29, 2026

can we raise an exception if AttrsDict.__setattr__() is called?

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented Apr 30, 2026

Im also surprised hit pars is so much faster than channelmap despite it being a larger dictionary with a more complex structure? We should look into why

This is just because it's doing a shallow copy of the full dictionary without having to walk through any of it, while channelmap has to update a few things (note that the first read takes 3 s, much longer than channelmap).

Otherwise, I added AttrsDict_RO as a read only version of AttrsDict, by raising an exception when __setattr__ and __setitem__ are called, and returning a AttrsDict_RO instead of AttrsDict when __getattr__ and __getitem__ are called. Also deepcopy on an AttrsDict_RO returns an AttrsDict since this removes the risk of side-effects. It's quite easy to work around these restrictions, python being python, but it's harder to do so accidentally. on is updated to return a read only dict.

This change will break channelmap in pylegendmeta; this is because it calls on on multiple databases and then combines them. The fix is to deepcopy the first one we read and then combine the others into it. This performs ~3x faster than the old (since it's not doing quite as much deep copying), but the added deep copy is ~2x slower than my test above (but the fast version I think has some side effects that we don't want). I'm not sure if there are other situations where people are trying to modify their AttrsDicts from on, but if they are this would also break.

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented May 1, 2026

Hmm well the AttrsDict_RO solution has some problems. Pickling in particular is going to be hard (impossible?) to make work, since the class derives from dict. I will investigate other solutions to this...

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented May 1, 2026

Ok attempt 2 at readonly: now I've added a __readonly__ attribute to AttrsDict. This will raise an Error if you call __setitem__. When you set __readonly__ it will propagate to any child AttrsDicts; this takes some time, but it's not terrible; it's barely noticeable for channelmap, while for the hit pars, it goes from 8 ms to 25 ms, but given how fast it was before this isn't a problem. It will also emit a warning if you try to change __readonly__ from True to False. I have also made it so that __setstate__ (which is used for deepcopy and pickling) will set __readonly__ to False regardless of what the original object was since the copy won't have unintended consequences. on will set __readonly__ to True.

I have also added more tests than my first attempt. Note that if not explicitly implmented, deepcopy uses __getstate__ and __setstate__ if not explicitly defined, so by testing deep copy it is implicitly testing pickle as well.

@gipert
Copy link
Copy Markdown
Owner

gipert commented May 4, 2026

so are we sure now that this pr does not introduce any regression? (see snipped posted by george)

@iguinn
Copy link
Copy Markdown
Contributor Author

iguinn commented May 6, 2026

As sure as we are in the completeness of the tests. For what it's worth, I reran the snippet from George and it worked correctly. I've also run a few tests with the querying functions which use this quite a bit under the hood and haven't seen any problems yet.

@gipert gipert merged commit d5f8955 into gipert:main May 6, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants