Conversation
7bbea9b to
1992d29
Compare
051ede0 to
18cbf90
Compare
18cbf90 to
b1da52a
Compare
0b0ddaf to
9f5cbc1
Compare
33ebc35 to
2671df4
Compare
include/openPMD/Iteration.hpp
Outdated
| private: | ||
| Iteration(); | ||
|
|
||
| struct DeferredRead |
There was a problem hiding this comment.
Let's add a doxygen string here, just to clarify what this switch does.
Would DeferredReadAccess be more explicit? Not sure, ... maybe.
There was a problem hiding this comment.
If we rename this one anyway, I'd go for DeferredParseAccess to avoid confusion with deferred load_chunks
There was a problem hiding this comment.
The implementation of the lazy/deferred parsing reads well. One general comment, I would unify the naming for this in the code base, at the moment we use:
- lazy
- deferred read
- deferred access
- (not) yet accessed
To describe the same thing, which can be confusing as the same concept in the code is meant.
Should we skip the introduction of the term "lazy" and instead use "deferred (iteration) access" in all places?
Do we want to make this a Series constructor parameter or JSON option?
Should we introduce the builder pattern in a separate PR and/or use it more consistently throughout the code base & examples?
7e2e3f6 to
0c481b4
Compare
franzpoeschel
left a comment
There was a problem hiding this comment.
The implementation of the lazy/deferred parsing reads well. One general comment, I would unify the naming for this in the code base, at the moment we use:
* lazy * deferred read * deferred access * (not) yet accessedTo describe the same thing, which can be confusing as the same concept in the code is meant.
Should we skip the introduction of the term "lazy" and instead use "deferred (iteration) access" in all places?
I've renamed things now:
parse_lazily -> defer_iteration_parsing
NotYetAccessed -> ParseAccessDeferred
Iteration::deferRead -> Iteration::deferParseAccess
DeferredRead -> DeferredParseAccess
m_deferredRead -> m_deferredParseAccess
parseLazily -> runDeferredParseAccess
Do we want to make this a Series constructor parameter or JSON option?
Good idea, done
Should we introduce the builder pattern in a separate PR and/or use it more consistently throughout the code base & examples?
I introduced the builder pattern since it made for a better way to have a constructor with more than one defaulted argument. Since we're now back at one defaulted argument, we don't necessarily need it anymore. I've isolated the builder pattern to a branch franzpoeschel/topic-builder now, should I do a PR?
Also, we can definitely use it more in the examples and tests. In the code base itself probably not so much since we seldomly create a Series object ourselves.
include/openPMD/Iteration.hpp
Outdated
| private: | ||
| Iteration(); | ||
|
|
||
| struct DeferredRead |
There was a problem hiding this comment.
If we rename this one anyway, I'd go for DeferredParseAccess to avoid confusion with deferred load_chunks
include/openPMD/Series.hpp
Outdated
| std::unique_ptr< ParsedInput > parseInput(std::string); | ||
| // use a template in order not to expose nlohmann_json to users | ||
| template< typename JSON > | ||
| void parseJsonOptions( JSON const & ); |
There was a problem hiding this comment.
Since you only call this in the .cpp file, you can define this helper function only in the Series.cpp file and in a namespace {} (as you already do) and avoid that it gets exposed as a function and symbol at all :)
There was a problem hiding this comment.
It's not that easy because that function accesses private members. But we can theoretically make all members of SeriesData public and inherit it privately in SeriesInternal. So, the data struct can be passed to non-friend functions for manipulation?
Edit: Pushed a commit that does just that.
Edit: Not so easy either, this breaks dynamic casting. I've made it a public inheritance now, meaning that SeriesInternal has all members public now. Series still hides them.
| uint64_t iteration{}; | ||
|
|
||
| // support for std::tie | ||
| operator std::tuple< bool &, int &, uint64_t & >() |
First attempt at an implementation of lazy parsing todo: * file-based layout * iterator access Basic file-based deferred parsing Remove changes to Container class Read eagerly by default Read iteration upon Iteration::open Expose to frontend and add some tests Use Builder Pattern for Series
In that case, we must not attempt to piece back together the filename by filebased methods, but instead directly use the filename specified by the user.
parse_lazily -> defer_iteration_parsing NotYetAccessed -> ParseAccessDeferred Iteration::deferRead -> Iteration::deferParseAccess DeferredRead -> DeferredParseAccess m_deferredRead -> m_deferredParseAccess parseLazily -> runDeferredParseAccess
This makes all members of SeriesData public, but Series still hides them.
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
6ded0af to
e99630c
Compare
Opening a Series with many iterations is currently a time-intensive procedure due to eager parsing.
This adds an option to lazily parse a Series. If that option is chosen,
Iteration::open()must be called before accessing.ReadIterationswill do that automatically.I previously attempted parsing iterations automatically upon accessing them via the
Containerinterface, but there were too many edge cases and it's probably better to do such things explicitly in parallel situations.TODO:
including removal of failed first attemptauxiliary::Optioninstead of NaNs to represent no scaling.