-
Notifications
You must be signed in to change notification settings - Fork 0
Cache path matchers #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
+ removed restriction on non local tag queries on Cache writing.
+ This allows us to terminate the recursion when querying set names or the set membership.
- move tag functionality to SceneInterface and express in terms of sets. - added getRoot to scene interface to query the top level SceneInterface. - removed testWritingOnFlushedFiles as child SceneInterfaces always hold references to the root
johnhaddon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Don,
Gonna just put all my thoughts here rather than mix GitHub with email. I haven't looked at everything, in fact I've only looked at the SceneInterface and SceneCache changes and then stopped, because there seems to be more than enough to talk about with just those. I'm a bit under the weather, so I might be misreading things, but if I'm not, I'm already concerned about both the performance and the clarity of the code. If my cold-addled brain is worrying about nothing, please forgive me...
My approach has been to store tags as sets on the root scene interface and make the tag API concrete on the SceneInterface and implement it as querying sets on the root. This requires a pointer to the root SceneInterface from any SceneInterface. I liked the idea of new SceneInterfaces just supporting the set API and the tags would just work.
I found that idea appealing too at first, but I want to at least play devil's advocate. My concerns are :
- It doesn't really buy us anything. As we move to new formats, we'll be moving to sets anyway.
- It totally reimplements a feature that is still really important at Image Engine, and is important for performance. Maybe you have benchmarks to prove otherwise, but I find it hard to believe that simple queries like
hasTag()andreadTag()have the same performance as before. My understanding is that they're now forced to construct sets (maybe even entire sets?) whereas before they were definitely cheap.
My main problem is with the SceneInterface pointer to the root SceneInterface and ensuring this is const correct.
I'm hoping that's achievable by dropping m_root and just using SceneInterface::scene(), but presumably it's not? I do think we have bigger problems to address first though. I think it'd be beneficial for us to step back a bit to correct any misconceptions I might have and/or address my concerns about the SceneCache changes. I'm not sure I really understand how the tag->set conversion is happening, or the performance implications...
Cheers...
John
| static void registerCreator( const std::string &extension, IECore::IndexedIO::OpenMode modes, CreatorFn f ); | ||
|
|
||
| private: | ||
| SceneInterfacePtr m_root; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SceneInterface is meant to be an abstract class, so this feels wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SceneInterface is meant to be an abstract class, so this feels wrong.
Do you mean it's supposed to be an interface class with no storage and only pure virtual function signatures? If it did have this property what are the benefits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean it's supposed to be an interface class with no storage and only pure virtual function signatures?
Yes.
If it did have this property what are the benefits?
For me, trying to understand the code, one huge benefit would be clarity. I find the various SceneInterfaces are already a bit mind boggling, since they often have multiple internal implementation classes each with their own relationships. Pushing those relationships out into the base class really doesn't help.
Might this "all children have ownership of the root" setup also preclude certain implementations from storing their children for fast child() queries? Because it would force a circular reference? I have a nasty feeling I've had to avoid this pattern in the past for those reasons. Baking in assumptions about how these subclasses must be implemented seems wrong, especially because it's only to provide short term compatibility for tags. It's basically turning a hack into a structural feature.
Finally I guess it's the fact that we've been heading in a nice direction where we don't even expose the headers for our derived classes, so the implementation is all nicely hidden and changeable without breaking major version. This kindof breaks that.
| /// Returns a const interface for querying the scene at the given path (full path). | ||
| virtual ConstSceneInterfacePtr scene( const Path &path, MissingBehaviour missingBehaviour = ThrowIfMissing ) const = 0; | ||
|
|
||
| /// Returns the root SceneInteface for this heirahcy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heirahcy -> hierarchy.
| virtual ConstSceneInterfacePtr scene( const Path &path, MissingBehaviour missingBehaviour = ThrowIfMissing ) const = 0; | ||
|
|
||
| /// Returns the root SceneInteface for this heirahcy. | ||
| SceneInterfacePtr getRoot() const { return m_root; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I've explained before, in Cortex and Gaffer, set/get are always paired and imply mutability of a member, so this should just be root(), since its immutable. That said, I don't get the need for this method - how is it different to calling this->scene( Path() )?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I've explained before, in Cortex and Gaffer, set/get are always paired and imply mutability of a member.
Let's hope you never have to mention this again m _ _ m.
That said, I don't get the need for this method - how is it different to calling this->scene( Path() )?
I added this for the reading tags in the LinkedScene case. What happens when I want to read a tag on a linked scene which was saved with v10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I don't understand the details here. But lower down you say "Looks like I can do this - thanks for pointing it out.", in reference to the same question, so I'm a bit confused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I don't understand the details here. But lower down you say "Looks like I can do this - thanks for pointing it out.", in reference to the same question, so I'm a bit confused.
I took another look at this->scene( Path() ) and I don't need to keep a root SceneInterface. We do require every class derived from SceneInterface to do something like keeping reference to the root or something similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took another look at this->scene( Path() ) and I don't need to keep a root SceneInterface.
Ah, good! Does that also take care of your const-correctness issues, since there's a const and a non-const scene(), or is that separate?
We do require every class derived from SceneInterface to do something like keeping reference to the root or something similar.
Yep, they need something similar in order to implement scene(). But they've all avoided using a SceneInterfacePtr to the root, even though that might have been the obvious choice, so I do wonder if that was for good reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good! Does that also take care of your const-correctness issues, since there's a const and a non-const scene(), or is that separate?
I think this will be OK, but as you've pointed out there are more important issues I should address first.
Yep, they need something similar in order to implement scene(). But they've all avoided using a SceneInterfacePtr to the root, even though that might have been the obvious choice, so I do wonder if that was for good reason.
Each concrete implementation of SceneInterface needs to either store some shared data or pointer to parent. For ::scene to be implemented we in SceneCache we need to walk parent pointers back to the root.
cortex/src/IECoreScene/SceneCache.cpp
Line 585 in e160d55
| ReaderImplementation *root = this; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, but for instance, Alembic Scene stores things the other way, and each location owns a map of its children :
Now it's not storing a map of SceneInterfacePtr so I think it would be OK, but if it did want to (and that doesn't seem unreasonable), then the root pointer would preclude that because it would cause a circular reference...
| } | ||
| } | ||
|
|
||
| IECore::CompoundDataBasePtr m_sets; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, this isn't used here in the base class, but is instead initialised in different ways in the derived classes. I'm guessing here, but in the reader I think it's used for converted tags? And then m_cachedSets is used for something else? If so, the naming is really unclear, as is having it in a class where it isn't used...
| return this; | ||
| } | ||
|
|
||
| void convertTagsToSets() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this should be private? Little things like that would make it a lot easier to wrap your head around this sort of thing...
| return nullptr; | ||
| } | ||
|
|
||
| ObjectPtr ptr = IECore::Object::load( sets, localSetsEntry ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks suspiciously like we load all sets at once? If so, I don't think that is acceptable from a performance standpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks suspiciously like we load all sets at once? If so, I don't think that is acceptable from a performance standpoint.
We used to load tags for the entire heirachy up front, by virtue of them being defined as directories before didn't we? I'll separate this in to a directory per set.
Regarding performance, I think we need some specific performance regression tests which I'll add once the design is on a better footing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to load tags for the entire heirachy up front, by virtue of them being defined as directories before didn't we? I'll separate this in to a directory per set.
We did, but I think we can do better here. Also, we need to do better so that we can load the names of the sets without loading the sets themselves. Directory per set sounds like the answer...
| { | ||
| if( m_sets ) | ||
| { | ||
| return runTimeCast<CompoundDataBase>( m_sets ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the intention is to avoid the work done later in this function, but then m_sets isn't updated later in this function at all. What is the intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you missed this comment? For me, this was the crux for understanding (or not) what was going on here. I really couldn't follow what this was used for and when. My best guess was that it was only for converted tags, and that it would only ever be updated on the root?
I realise I actually don't even know why we store sets at all. It doesn't seem too unreasonable to just pay the cost whenever they are loaded, especially from a Gaffer standpoint where they'll be externally Gaffer anyway. If we were to cache them, then we should probably be doing it in a similar vein to SharedData::objectCache and SharedData::transformCache. These use a ComputationCache/ObjectPool combo for storage, so we have external control over how much memory is used, and we use an LRU strategy for throwing away stuff that hasn't been used for a while.
I think it'd be great to start with a much simpler stateless (as in no m_sets) implementation that just loads on demand and is easy to verify.
| nc = nc->getRoot(); | ||
| nc->convertTagsToSets(); | ||
|
|
||
| IECore::CompoundDataBasePtr compoundDataBase = getSetCompound(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, here we're loading the contents of all the sets, just to retrieve the names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, here we're loading the contents of all the sets, just to retrieve the names?
Write
Yup - luckily this should be easy to fix.
| return; | ||
| } | ||
|
|
||
| m_sets = new CompoundData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we seem to be initialising m_sets on every location as we traverse through everything, but the only place I can find us putting anything into m_sets is in the root in appendToSet(). Is that right or have I misunderstood?
|
Hi John, Thanks for looking at this in it's current mess.
Doesn't it buy us flexibility to change dependent projects independently of cortex 10? Also, doesn't it allow us to implement sets in Alembic & USD and the tag API will work out of the box? Perhaps I'm being too cautious and we should change everything to use sets? Or keep both tags & sets in the files and slowly phase out tags?
Changes like this would require a full performance benchmark for sets / tags.
Looks like I can do this - thanks for pointing it out.
Should we discuss them as part of the PR or in person? |
Yeah, it does have some benefits, and like I said, I liked the idea at first. I do think this is a matter of opinion, and probably worth having Andrew's input. But after sleeping on it I find myself becoming more and more the devil and not just his advocate :
For the low level details, I think continuing the discussion in the PR might be the easiest so we can reference the code. There's a few unanswered questions up there about how Potentially the higher level discussion about whether we emulate tags for all formats would be better in person, or in a three way email with Andrew. I'm a bit of a mess today though - I'm sick and haven't really slept since 2am. I'm about to plead for a pass on today's Skype, but even if I don't get it I'm likely to just be a lifeless lump propped up in front of the screen. Maybe we should just put the pros/cons we've discussed in an email and run it by Andrew first? I could do a Skype another day once I've caught up on some sleep... Thanks for humouring me with this... |
No description provided.