make the form of the vodml-ids normative #46
base: 20-update-vo-dml-standard-document
updates the document for #18
Looking at this in the context of CAOM-2.5, I have noted that I have used vodml-id(s) and utype(s) in tap_schema inconsistently and they should be related: specifically, the canonical tap_schema should be derived from the model. Is that what you have in mind as well?

If I grok the proposed rules/BNF, then these two examples should illustrate: vodml-id

First, the scheme prefix: utype style is technically a vodml-ref rather than vodml-id, which is probably conceptually correct.

Second, the rest: historically, utype style also tended to encode the path of composition names rather than the simpler type-name + attr name. The second example above shows this mismatch. With VO-DML-1.0 I could choose the style, and hence fix the model or the tap_schema. With this rule in place as-is, the vodml-id(s) would have to conform and the utype(s) would have to change to match.

The utype style does tend to make rather longer utype strings than necessary but it provides some context / readability... at least that's probably the intent/idea behind writing them that way. The vodml-id rules as written would make for simple, more or less finite-length vodml-id (and hence utype) values (finite meaning independent of the complexity/nesting/hierarchy of the model and just based on sane naming).

Thoughts? |
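A minimal sketch of the two naming styles contrasted in the comment above, in Python. Everything in it is an assumption made for illustration: the prefix `ex`, the type and attribute names, and the dotted `Type.attribute` form, which is only one reading of the proposed rules and not a quote from the spec or from CAOM.

```python
# Hypothetical toy model; the prefix and all names are illustrative only.
PREFIX = "ex"


def vodml_ref(type_name: str, attr: str) -> str:
    """Type-based style: prefix + type name + attribute name."""
    return f"{PREFIX}:{type_name}.{attr}"


def path_utype(path: list[str]) -> str:
    """Classic utype style: prefix + composition path from the root type."""
    return f"{PREFIX}:" + ".".join(path)


# The same leaf attribute, named in both styles.
short_ref = vodml_ref("Interval", "lower")
long_utype = path_utype(["Survey", "field", "window", "lower"])

# A deeper nesting lengthens the path-style string but not the type-based one.
deeper_utype = path_utype(["Archive", "Survey", "field", "window", "lower"])
```

The type-based string stays the same length however deeply `Interval` is nested, while the path-style string grows with each level of composition.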
yes the current version of the tooling does produce a TAP schema automatically from the VO-DML (with any possible variations in style tied down in the binding) |
I thought that one of the motivations behind VO-DML was to bring some rigour and uniformity to UTypes as a result of https://www.ivoa.net/documents/Notes/UTypesUsage/index.html - but a consequence of this will mean that backwards compatibility will be impossible to guarantee for models that declared UTypes that did not conform to the new style. |
It should now be possible to rigorously define utypes as "shortcut" references into a VODML-compliant model, at least for backwards compatibility. I am not sure how useful this is unless the mapping is standardized. The main obstacle to using utypes was that the same attribute of the same type, say a sky position, when appearing in multiple parts of a model, let alone across models, would end up having hundreds of different utypes. This stemmed from a concrete problem when we were working on the Spectral/SED models and applications. Since the consensus was that utypes weren't to be parsed, a client capable of dealing with sky positions couldn't try and figure out that Now, a client using utypes would need to match a string like Is this worth the effort? I am not really sure. |
|
I should add that the reason we left the IDs as opaque string was also to allow the freedom that @pdowler was mentioning, so I sympathize with the objection to enforcing a specific grammar. |
|
As I said in #18 the tooling does not actually directly use the I personally think arbitrary |
|
I'm not really objecting to vodml-id being specifically defined, just commenting that the short specific definition (which would lead to a vodml-ref like They are both unique and specific, the vodml-ref is shorter (length independent of model complexity) which I think is a feature. The classic utype is longer (can get quite long and cumbersome) but I guess feels like it conveys some context info. That's probably why people wrote them like this by hand. So at this point, I'd probably agree with the shorter one as expressed in the PR (but see my next point below). |
|
Here is the BUT :-) and I hope it makes sense. Then there is the way that I have used DataType in the caom2 model. Example ObjectType
Let's say I have a TAP service with one column for the interval value (double[2] in PG). That column would have But, what if another implementation of caom2 wants to store the interval in two separate columns (type double). Then by the spec these columns would have TL;DR - utype==vodml-ref + re-use of dataType(s) + implementation decision -> non-unique utypes Which of those has to give? I do not want to try to define utype again because I thought we had :-) |
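One way to picture the TL;DR above, as a rough Python sketch. The prefix `ex`, the column names, and the utype strings are all hypothetical stand-ins (the actual strings from the comment are not reproduced here); the only assumption carried over is that each column's utype equals the vodml-ref of the model element it stores.

```python
from collections import Counter

# Hypothetical layout A: each interval is stored in a single array column,
# so the utype points at the attribute that holds the interval.
one_column_layout = {
    "energy_bounds": "ex:Energy.bounds",
    "time_bounds": "ex:Time.bounds",
}

# Hypothetical layout B: each interval is split into two scalar columns,
# so the utype points into the re-used Interval dataType.
two_column_layout = {
    "energy_bounds_lower": "ex:Interval.lower",
    "energy_bounds_upper": "ex:Interval.upper",
    "time_bounds_lower": "ex:Interval.lower",
    "time_bounds_upper": "ex:Interval.upper",
}


def duplicated_utypes(columns: dict[str, str]) -> list[str]:
    """Return the utypes that are attached to more than one column."""
    counts = Counter(columns.values())
    return sorted(utype for utype, n in counts.items() if n > 1)
```

Under these assumptions `duplicated_utypes(one_column_layout)` is empty, while the two-column layout yields both `Interval` refs twice: the re-used dataType is what makes the utypes non-unique.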
|
So I think that the above example basically means that if you want to convey the a UType of As I said above internally in VO-DML documents and between VO-DML documents the tooling just uses Incidentally the VO-DML standard says almost nothing about UTypes and I am really not sure that they are that well defined anywhere in an IVOA standard. |
|
On Fri, Mar 21, 2025 at 05:29:15AM -0700, Paul Harrison wrote:
> Incidentally the VO-DML standard says almost nothing about UTypes
> and I am really not sure that they are that well defined anywhere
> in an IVOA standard.
They are not. The closest we've got was when the utypes tiger team
produced a survey of what folks believed utypes should be,
https://ivoa.net/documents/Notes/UTypesUsage/.
And then everything went up in flames. This means that to this day
there is nothing normative on, say, what the "prefix" means (this, by
the way, is what got *me* into this game), who gets to sanctify
or mint or whatever utypes, whether by industry standard they have to
be longer than 100 characters, whether to compare them
case-insensitively, and so on.
In the meantime, there are a few more cases where we're also using
utypes; RegTAP stores xpaths in there, and TableReg
<https://ivoa.net/documents/Notes/TableReg/20240821/> has ivoids.
Perhaps this would be a good opportunity to update the usage note.
However, I'd say the bottom line is: do whatever you need to with
utypes. There are so many conflicting practices that another one
won't hurt.
|
|
Actually I have just noticed on p 20 of the 1.0 spec
perhaps this text should be moved up to the main vodml-id section too. |
|
As I mentioned, without changing anything in the document it should be possible (but I haven't tried in the past 10 years) to define a mechanism to declare utype-style strings based on a vodml-compliant model. A well-defined model like caom2 would be a perfect benchmark. This could be based on some kind of heuristics rather than a standard, for simplicity. [Even for an old model that only declares utypes (and some hand-wavy set of boxes and arrows), one could probably reverse-engineer the utypes into a compliant model, so that the utypes remain the same. We probably did that in the Spectral/SED context in a previous lifetime.]

The point is that vodml ids and refs are supposed to be opaque, and they could still be so, leaving people the ability to produce them in a more user-readable fashion, like I understand caom2 does. This saves the opaqueness, "algorithmical" nature of the vodml ecosystem, while also producing more humanly appealing strings, if one wishes to do so.

These are still not utypes, because utypes are supposed to be a single string that conveys a lot of information. These utypes have issues in more complex cases [unless one starts to use a parsable grammar for e.g. arrays, like For cases in which one doesn't need GROUPs of related data elements with the same utype, a simple heuristic that maps a human-readable utype string to a compliant model element could be enough. So wherever in the IVOA standards a utype is still required, these utypes could rigorously (or even just heuristically) point to an element of a compliant model.

To recap, the idea behind the utype was to have a single non-machine-parsable human-parsable string that would convey information about the type of a data element and its role in a complex structure. This was impossible to do, in a rigorous way, with just one string. VODML now provides a way for that string to be what it was supposed to be, a pointer to a model element, its type and its role.
It's possible that the simplest way to do this is what caom2 does, and to produce vodml-ids and vodml-refs in such a way that the same string can work both as a utype and a vodml-ref, and that in order to do so one shouldn't make Appendix C normative. |
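As a rough illustration of the "utype as a pointer into a compliant model" idea above, here is a small Python sketch. It is not a proposed standard mapping: the toy model, the prefix `ex`, and the dotted path syntax are all assumptions invented for this example, and the walk is just one possible heuristic.

```python
# Hypothetical toy model: type name -> {attribute name: attribute type}.
MODEL = {
    "Survey": {"window": "Interval", "name": "string"},
    "Interval": {"lower": "real", "upper": "real"},
}


def resolve(utype: str, prefix: str = "ex") -> str:
    """Map a path-style utype to the type-based ref of its last attribute.

    The path is walked attribute by attribute, following each attribute's
    declared type, so the result names the element the utype points at.
    """
    scheme, _, path = utype.partition(":")
    if scheme != prefix:
        raise ValueError(f"unknown prefix in {utype!r}")
    current, *attrs = path.split(".")
    ref = None
    for attr in attrs:
        if attr not in MODEL.get(current, {}):
            raise KeyError(f"{utype!r} does not match the model")
        ref = f"{prefix}:{current}.{attr}"
        current = MODEL[current][attr]  # step into the attribute's type
    if ref is None:
        raise KeyError(f"{utype!r} names no attribute")
    return ref
```

With this toy model, `resolve("ex:Survey.window.lower")` gives `ex:Interval.lower`: the long string keeps the role information for human readers, while the resolved ref identifies the model element.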