PeterBaker0
left a comment
As @arlowhite mentioned, we can't embed this into a Docker layer since it depends on runtime processing of mounted data. However, on a shared filesystem (e.g. EFS) it will still allow the work to be done only once, rather than every time a container starts or restarts.
My only recommendation would be to, at some stage, associate a unique hash (such as "input data + code version + ...") with the file cache, so that we don't have to go and clean up stale data from the cache when processes change.
Thanks for doing this.
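A minimal sketch of what such a key could look like, assuming the cache inputs are a list of files and the code version is available as a string. `cache_key` and its arguments are hypothetical names, not something in this PR; only the `SHA` standard library is used.

```julia
using SHA

# Hypothetical helper: combine the code version with a digest of every input file.
# Any change to the inputs or the version yields a different key, so stale cache
# files are simply never looked up again.
function cache_key(input_paths::Vector{String}, code_version::String)
    parts = String[code_version]
    for path in sort(input_paths)  # sort so the key does not depend on argument order
        file_digest = bytes2hex(open(sha256, path))  # streams the file through SHA-256
        push!(parts, basename(path) * ":" * file_digest)
    end
    return bytes2hex(sha256(join(parts, "\n")))
end
```

The key could then be used as (part of) the cache file name, e.g. `joinpath(cache_dir, cache_key(paths, version) * ".dat")`, where `cache_dir`, `paths` and `version` are placeholders.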
I'll put this in as a separate issue for later enhancement. Cheers @PeterBaker0
Please note this PR modifies the expected config settings, notably renaming CACHE_DIR to something more appropriate:
With JLD2 (HDF5-based Julia data store)
Using built-in Serialization/Deserialization
Obviously, I went with the direct Serialization/Deserialization method. Size on disk is much smaller too (2.7 GB vs 7.8 GB, though I'm not sure whether any compression was applied with JLD2).
A further alternative is to create the cache at compile time, but that requires more digging.
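For anyone reading along, the load-or-build pattern with the built-in `Serialization` module can be sketched roughly as below. This is an illustration under assumptions rather than the code in this PR: `load_or_build` and the cache path are hypothetical names. Note that the `Serialization` format is not guaranteed to be readable across Julia versions, which is one more reason to tie cache files to a version hash.

```julia
using Serialization

# Hypothetical helper: return the cached value if the file exists,
# otherwise run `build`, store the result, and return it.
function load_or_build(build::Function, cache_path::AbstractString)
    if isfile(cache_path)
        return deserialize(cache_path)
    end

    data = build()
    mkpath(dirname(abspath(cache_path)))

    # Write to a temporary file first, then rename, so a container that is
    # killed mid-write does not leave a truncated cache file behind.
    tmp_path = cache_path * ".tmp"
    serialize(tmp_path, data)
    mv(tmp_path, cache_path; force=true)
    return data
end
```

With do-block syntax the expensive step is only run on a cache miss:

```julia
result = load_or_build(joinpath(mktempdir(), "example.dat")) do
    Dict("values" => collect(1:10))  # stand-in for the expensive processing
end
```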
@PeterBaker0 @arlowhite could one of you test please?
I recently updated to Windows 11, and by default Julia is now blocked by the firewall (a sysadmin needs to change the config), so I can't test the REST API at the moment.