Swap call to repo to use enhanced repocache #240
|
I have swapped the obvious calls from repo to repocache. To proceed further, I am going to need to cache the filenames and file sizes. @jonc125, is this your understanding, and shall I proceed to do that? I guess I will need new entities for these. |
Codecov Report
@@ Coverage Diff @@
## master #240 +/- ##
========================================
+ Coverage 94.6% 94.7% +<.1%
========================================
Files 65 65
Lines 2638 2656 +18
Branches 276 278 +2
========================================
+ Hits 2498 2516 +18
- Misses 101 102 +1
+ Partials 39 38 -1
|
|
I'll try to take a look at this first thing tomorrow. Personally I don't think the gain from caching filenames is sufficient to be worth it at this stage. |
jonc125
left a comment
This is good for the bits it touches, but there's more we can do just with the repocache additions we have so far. Some examples in entities/views.py, there may be more:
- EntityVersionMixin.get_commit: in most (but not quite all) places where this is called, a CachedEntityVersion instance would suffice. So perhaps add a new get_version method that returns the cache object, and review calls to see which can change over.
- Similarly EntityVersionMixin.get_context_data can use the cache object for the version field most (hopefully all!) of the time. This is a little trickier to test, as templates don't give you an error on failed lookups by default. So might need to use something like https://stackoverflow.com/a/7854404 to help catch instances! I suspect author lookups and numfiles should be the only ones that differ between Commit and CachedEntityVersion so there shouldn't be many issues.
- EntityVersionListView and EntityRunExperimentView hopefully don't need to use repo.get_commit
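On the failed-lookup point: one common way to make such lookups fail loudly is to give Django a `string_if_invalid` value that raises as soon as it is formatted. Django substitutes the name of the invalid variable into that string when it contains `%s`, which is what triggers the error. The sketch below does not import Django, so it can be run on its own. The names `InvalidTemplateVariable`, `UndefinedTemplateVariable` and `STRING_IF_INVALID` are hypothetical, and the settings line shown in the comment is an assumption about how a test configuration might wire it in, not something taken from this PR.

```python
class UndefinedTemplateVariable(Exception):
    """Raised when a template refers to a variable that cannot be resolved."""


class InvalidTemplateVariable(str):
    """A string that raises as soon as it is %-formatted.

    Django formats ``string_if_invalid`` with the failed variable name when
    the string contains ``%s``; that formatting step calls ``__mod__`` below.
    """

    def __mod__(self, variable_name):
        raise UndefinedTemplateVariable(
            "Undefined template variable %r" % (str(variable_name),)
        )


# Hypothetical test-settings fragment (an assumption, not project code):
# TEMPLATES[0]["OPTIONS"]["string_if_invalid"] = InvalidTemplateVariable("%s")
STRING_IF_INVALID = InvalidTemplateVariable("%s")
```

With something like this enabled only in the test settings, a template that still asks a cached version object for an attribute it does not have should fail the test instead of silently rendering an empty string.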
|
I'm just testing changes to EntityVersionMixin. I will commit these shortly. |
|
I hope this is now ready for re-review. In some of the tests I decided to stick with |
|
I have had several attempts (see check ins) to add the |
jonc125
left a comment
This is looking promising, though I've not tried running it yet, just added some comments from reading through the diff.
|
Some tests are now failing with missing filenames, so it looks like I have been overzealous with the swapping. On it now. |
|
Rather than back out most of the changes as you've done in f5c08eb, you can change the references in templates. The error looks like "Undefined template variable 'commit.filenames' in 'entities/entity_versions.html'". We didn't cache filenames, but we did I think cache numfiles, and most occurrences of filenames in templates pipe it through len, so the number of files is what they actually need, not the names. In other words, change |
|
Ah yes, sorry, I must have been having a few funny minutes. Makes sense, I will do that now. |
|
I'll run it here in a bit and check whether I see the same thing. |
|
Thanks, it works OK on Linux. I'm just seeing if I can spot anything obvious on Windows.
|
I had to run It might just be that your browser needs to re-fetch the latest JS code into its cache? Spotted one task remaining: |
|
I've had a look at replacing the use of tag_dict as suggested. I have come up with the following: unfortunately. It seems that sometimes the entity passed is a model but the commit was for a protocol. I have gone back to master and this was still the case. It didn't matter because repo.tag_dict presumably has both types of commits. I can't, at the moment, understand why the passed entity type is wrong. One approach I tried was to query all cached objects with CACHED_VERSION_TYPE_MAP, but that seems overkill. I guess the other option is to maintain our own repocache.tag_dict. I wondered if you had any suggestions please? In the meantime I will try and understand why models are being passed in by the templates for a protocol commit. ...... I may have been overthinking this; I am now looking at just getting tags from the commit object. |
|
That's indeed the replacement code I'd expect. The commit not belonging to the entity sounds like a bug! In such circumstances the original code will just return the commit.sha - the Are there particular tests that hit this bug? Or just in general browsing? |
|
Possibly the culprit is
diff --git i/weblab/templates/entities/entity_runexperiments.html w/weblab/templates/entities/entity_runexperiments.html
index 223ae0d..f1f5505 100644
--- i/weblab/templates/entities/entity_runexperiments.html
+++ w/weblab/templates/entities/entity_runexperiments.html
@@ -46,7 +46,7 @@
name="model_protocol_list[]"/>
{% endif %}
<strong>
- <a class="entityversionlink" href="{% entity_version_url 'version' entity entity_version.commit %}">
+ <a class="entityversionlink" href="{% entity_version_url 'version' entity_object entity_version.commit %}">
{% include "./includes/version_name.html" with tags=entity_version.tags version=entity_version.commit only %}
</a>
</strong>
@@ -84,7 +84,7 @@
name="model_protocol_list[]"/>
{% endif %}
<strong>
- <a class="entityversionlink" href="{% entity_version_url 'version' entity entity_version.commit %}">
+ <a class="entityversionlink" href="{% entity_version_url 'version' entity_object entity_version.commit %}">
{% include "./includes/version_name.html" with tags=entity_version.tags version=entity_version.commit only %}
</a>
</strong>
In which case, mea culpa! |
@@ -134,10 +134,14 @@ def _url_friendly_label(entity, commit):
:param entity: Entity the commit belongs to
:param commit: `git.Commit` object
The docstring needs updating: we now need a CachedEntityVersion object here.
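For illustration only, here is a rough sketch of what a label helper with an updated docstring could look like when it receives a cached version object in place of a `git.Commit`. Everything in it is hypothetical: `FakeCachedVersion` is a stand-in dataclass, not the real CachedEntityVersion model, the `sha` and `tags` attributes are assumed, and picking the first tag is an arbitrary choice. The SHA fallback mirrors the behaviour described earlier in the thread for commits without a usable tag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeCachedVersion:
    """Hypothetical stand-in for a cached version row (not the real model)."""

    sha: str
    tags: List[str] = field(default_factory=list)


def url_friendly_label(entity, version):
    """Return a label for ``version`` suitable for use in a URL.

    :param entity: Entity the version belongs to (unused in this sketch)
    :param version: cached version object exposing ``sha`` and ``tags``
        (assumed attributes; the real cache API may differ)
    """
    # Prefer a tag when one exists; otherwise fall back to the SHA.
    if version.tags:
        return version.tags[0]
    return version.sha
```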
|
I've done a few experiments to see how much this PR improves the open file handles issue. I ran the entities test_views using the following patch to measure how many files are left open by the end. It only works on Mac or Linux though.
diff --git i/weblab/entities/tests/test_views.py w/weblab/entities/tests/test_views.py
index 8eefff7..c9ca969 100644
--- i/weblab/entities/tests/test_views.py
+++ w/weblab/entities/tests/test_views.py
@@ -23,6 +23,25 @@ from repocache.models import ProtocolInterface
from repocache.populate import populate_entity_cache
+def get_open_fds():
+    '''
+    return the number of open file descriptors for current process
+
+    .. warning: will only work on UNIX-like os-es.
+    '''
+    import subprocess
+    import os
+
+    pid = os.getpid()
+    procs = subprocess.check_output(
+        ["lsof", '-w', '-Ff', "-p", str(pid)]
+    ).splitlines()
+    print(procs)
+
+    nprocs = len([s for s in procs if s and s.decode()[0] == 'f' and s.decode()[1:].isdigit()])
+    return nprocs
+
+
@pytest.fixture
def analysis_task(protocol_with_version):
"""A single AnalysisTask instance with associated Protocol version & repocache set up."""
@@ -2668,3 +2687,8 @@ class TestEntityRunExperiment:
assert planned_experiment.protocol == protocol
assert planned_experiment.protocol_version == proto_commit1.sha
assert (planned_experiment.model, planned_experiment.model_version) in expected_model_versions
+
+
+def test_zzz_open_fds():
+    num_open_files = get_open_fds()
+    assert num_open_files < 3
So looking promising! |
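As an aside, a similar measurement can be sketched without shelling out to `lsof`, by listing `/dev/fd`. This is only an alternative sketch, not what the patch above does. It assumes a UNIX-like system where `/dev/fd` lists the current process's descriptors (Linux and macOS, not Windows), and `count_open_fds` is a hypothetical name. The listing itself may briefly use a descriptor that shows up in the count, so it is safer to compare counts before and after than to rely on the absolute number.

```python
import os


def count_open_fds():
    """Count entries in /dev/fd for the current process.

    Assumption: /dev/fd exists and reflects this process's open descriptors,
    which holds on Linux and macOS but not on Windows.
    """
    # The directory listing may itself appear as one extra descriptor, so
    # treat the result as a relative measure and compare deltas.
    return len(os.listdir("/dev/fd"))


if __name__ == "__main__":
    before = count_open_fds()
    handle = open(os.devnull)
    print("descriptors opened:", count_open_fds() - before)
    handle.close()
    print("descriptors leaked:", count_open_fds() - before)
```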
jonc125
left a comment
I think this is now ready to merge!
Swap call to repo to use enhanced repocache

Fixes #191
Changes to use enhanced repocache where possible.