adjust specification and implementation based on solid-efforts #19
use of multiple roles
We need to have a discussion about what information goes in the catalog and what does not. I am in favor of having a single provider/publisher field (with possibly multiple values) so we are able to tell which organization(s) or person(s) are primarily responsible. I am not in favor of adding maintainers, contributors, authors, editors, etc. All of that information is available via the link from the catalog to the resource.
I feel this way for two reasons, both based on multiple catalogs and directories I have built (including one which has been continuously in print for over 40 years).
a) Getting people to fill out data for themselves is a huge obstacle. We have had fewer than a dozen in the month the forms have been available. The more fields they have to fill out, the less likely they are to complete the forms.
b) Roles such as maintainers and contributors are likely to change often. People leave and people arrive in projects. It is very unlikely we will be able to maintain these kinds of records in any credible way.
I would prefer a single publisher field pointing to, e.g., the CG for specs. The organization that provides something is generally useful and easily absorbed information. To whom is information about which individuals occupied which roles useful? For those who need it, the information is readily available via the links from the catalog.
On the record for solidcommunity.net I want to know that ODI provides it. I do not care who on ODI staff is maintaining it. Is someone at ODI going to remember to change that if staffing changes? Is someone at CSS going to change their list of contributors when their staffing changes?
Stale information is worse than no information and far worse than a link to the correct information.
referring to ProductClass
I asked you how you would handle ClassOfProduct and, instead of answering, you proposed changes without explaining them. What predicate will be used to point to the ClassOfProduct? Why is it a shape when all of the examples you gave list the classes as SKOS concepts? Why wouldn't this be handled the same way as, e.g., subTypes of Servers, with the various product classes as the skos:Concepts? So if this were like the existing SHACL, it would be something like:
#ServerShape sh:targetClass spec:Server ;
  sh:property [
    sh:path ex:productClass ;
    sh:in (
      <https://solidproject.org/TR/notifications-protocol#ResourceServer>
      <https://solidproject.org/TR/notifications-protocol#SubscriptionClient>
      ...
    ) ;
  ] .
LearningResource
LearningResources are things like written documents and videos; they don't have maintainers and developers.
We should also clarify whether the long-term goal is for everyone to self-publish those descriptions. I would imagine that gradually more information will be self-published, and the catalog will host only 'incubating' resources and simply aggregate more mature ones. In that case we will have to work from self-published information rather than expect people to enter anything more than the IRI denoting their self-describing resource.
All properties proposed in this PR are optional, but your point might still hold, since even an optional big form may look intimidating. At the same time, if we aim for self-publishing, we probably should not assume a single app providing forms; the catalog might provide 'incubation' storage for resources and, sooner rather than later, encourage self-publishing. Of course we need to balance that against breaking changes to shapes and the availability of tools to migrate older versions of the data.
I think this comes back to self-publishing and who updates the canonical description.
Deployments are more complicated and out of scope for this PR. I only touch specifications and implementations here.
Github shows it weird, I added it on the
I don't think an enumeration makes sense here; we would need to extend it every time a new draft is created or classes of products get reorganized in existing drafts. Why would you want to have such an enumeration?
Please see the diff
We should also discuss the possibility of adding them to
We can eventually incorporate self-published data for specs, implementations etc. but what would self-publishing for an event, a video, a forum, or other types of assets look like and who would maintain it?
Getting that in place will be even harder than getting people to fill out the forms. That is, at best, a multi-year goal. Once that is in place, all of the maintainer etc. information will be available regardless of whether we try to include it now, and including it now will lead to stale data and large differences in the completeness of records.
Ieben is working on new forms which will go directly into the dataset. Yes, when there are multiple apps, they might each have their own viewer and forms. I am mostly concerned with the core app which will live on solidproject.org. I think multiple forms with varying fields going to that dataset sounds incredibly messy. Perhaps this is a case where each app has its own extended SHACL and forms that can be added to the core SHACL.
Again, once self publishing is in place we won't need forms for those that self-publish. Having those fields in the current SHACL does nothing to further that goal.
The same principle holds for implementations. If I know that Inrupt created it, I don't really need to know what staff was responsible and if I care, I can look at the github repo pointed to from the catalog.
Ah okay. Then I only object to it on the basis of too many fields not on being misapplied to LearningResources :-).
These don't have to be enumerations. We can have a skos:Concept ServerProductClass with it being in a broader relationship to NotificationServer, ResourceServer, etc. and in the SHACL say that the range is things that are narrower than the ServerProductClass concept. We would have to maintain the SKOS but we wouldn't have to change the SHACL. SKOS is an integral part of DCAT and it makes sense to me to use it. It apparently made sense to @csarven when he made them concepts in the spec.
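To make the "narrower than ServerProductClass" idea concrete, here is a minimal plain-Python sketch of the check such a constraint would perform. It is not SHACL; the `BROADER` table, the `ex:` identifiers, and `ex:ClientProductClass` are hypothetical stand-ins for whatever the SKOS would actually contain.

```python
# Hypothetical SKOS "broader" links: concept -> set of its broader concepts.
# These identifiers are illustrative assumptions, not terms from the PR.
BROADER = {
    "ex:NotificationServer": {"ex:ServerProductClass"},
    "ex:ResourceServer": {"ex:ServerProductClass"},
    "ex:SubscriptionClient": {"ex:ClientProductClass"},
}


def is_narrower_than(concept, ancestor, broader=BROADER):
    """Return True if `ancestor` is reachable from `concept`
    by following one or more broader links."""
    seen = set()
    stack = list(broader.get(concept, ()))
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, ()))
    return False
```

With this approach, adding a new product class only means adding a row to the SKOS data (here, `BROADER`); the check itself stays unchanged, which is the point about not having to touch the SHACL.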
Okay, I see now. There is dcterms:conformsTo but yes, maybe adding them to spec: is a good idea.
I would imagine those forms living in Solid apps, which by design should let the people using them publish wherever they choose. @csarven if you happen to look here first, I also posted some notes and mentioned you on solid/specification matrix https://www.w3.org/TR/skos-reference/#L1045
[EDIT: nevermind, sorry, saw it in the PR]
So triples would be like this?
AppA conformsTo ProductClassB .
SpecX definesConformanceFor ProductClassY .
In that file you have
accesses: [
  'https://data.example/Tasks',
  'https://data.example/Images',
]
I am currently handling those as keywords in order to solicit responses from contributors to see how they want to describe themselves. Once we have enough input, we can make them IRIs. We still need to discuss what the objects can be - media types? application types? rdf:Classes? shapes? So I'm thinking keep these as keywords for now. But I'm open to ideas.
Re @elf-pavlik's thoughts on SKOS - see https://interoperable-europe.ec.europa.eu/collection/semic-support-centre/document/skos-used-publish-controlled-vocabularies-defined-adms-web . There are other places SKOS is mentioned in DCAT as well. The metadata for a DCAT:Catalog includes the predicate dcat:themeTaxonomy, which is meant to point to a SKOS ConceptTree which defines themes (essentially subtypes or categories of types) used throughout a DCAT catalog. You call ConceptTrees "informal" but my understanding is that, when they are done correctly and have a permanent URI, they are formal controlled vocabularies and are used as such throughout the SEMIC.
May I ask why "ClassOfProduct" instead of "ProductClass" which is more concise?
The more I think about it, the less I believe the ProductClass belongs in the core catalog, and I suggest we create a second, extended SHACL resource to include these and other things which are primarily of interest to specification writers. Product Classes for clients are particularly weak; they are: What useful information is gained from knowing something is a WriterApplication or DiscoveryClient, and to whom is it useful? And for servers, I think the useful thing for pod-providers, developers, and self-hosters to know is what capabilities the server has. Given server extensions, these could include things like search and data wallets, which are not covered in the specs and are not product classes. I do think we need to have a "capabilities" field on servers to capture these, and some of the product classes could be considered valid values for that, but so would things that are not product classes.
I don't care much about the specific URI component; I could even go with wikidata-style P1234 for properties and Q1234 for classes if we have a hard time picking names.
You only looked at two drafts: Solid Notification Protocol and WebID Profile. Across all the drafts listed in https://solidproject.org/TR/ there are many more classes of products. One good example would be Solid-OIDC Client, which in my viewer would currently show two JS implementations and one Java implementation.
I think this information is of interest to spec writers and implementers; we can discuss different audiences and different viewers tomorrow. My current approach is that no single viewer has to show all the data in the catalog. Each one can show only a selected subset of it for its specific audience. When it comes to adding the data to the catalog and self-publishing, product classes will be maintained by spec writers, so I wouldn't worry about a possibly intimidating UI since the broader community will not enter that information. Your viewer doesn't need to deal with forms for specifications; I could provide forms for that. People who maintain implementations will just link to specific classes of product from the descriptions of their implementations. If they publish results of conformance tests, this information could most likely be ingested from there and they wouldn't even need to look at any forms with classes of products.
I don't object to the specs having this; there are only four fields currently, so another will not matter, and the burden will not be on implementers to fill it out.
Could you explain that in more detail?
While prototyping, I artificially broke up CSS into separate pieces. I don't think I want to keep that attempt, so all the other classes of products from all the other CSS entries here would go to the list above: https://elf-pavlik.github.io/solid-efforts/#/?tab=services Even if implementers currently don't run conformance tests, or such tests haven't been created yet, they should have a pretty good idea which classes of products from which specs they implemented or are implementing.
I see the value in Servers and Server extensions and Specs having Product Classes but the value with client applications eludes me. Other than the three Solid-OIDC clients, which client apps would have meaningful values in Product Class?
I wouldn't expect that applications would directly implement them, instead they would just use a package/module that implements it. https://elf-pavlik.github.io/solid-efforts/#/?tab=modules This is also a reason that I draw a clear distinction between app developers and shared libraries developers in
Okay, makes sense for libraries too (which I also am careful to distinguish from apps).
Okay, you've convinced me of the usefulness of Product Class. My remaining objections are primarily about the extra contributor fields. I guess for specs, the editors/authors are known and change rarely and the spec form is not crowded, and if we only had publisher on them, they'd all be the CG. So I have no objection to those additions. I'm still unconvinced about maintainer/developer on implementations. Let's revisit after tomorrow's meeting.
For the reference, besides I would map it
I don't like that I would expect @jeff-zucker would you list someone who makes occasional contributions (PRs) as |
No, I would list there the organization(s) that publish the software, similar to schema.org's definition for provider: "The service provider, service operator, or service performer; the goods producer." If the resource is not affiliated with an organization and there are clear authors, I would list them. In group projects where there is not a limited set of key authors, I would put something like "Community Coding Group". I would leave information about other kinds of contributors out of the catalog and assume anyone interested could easily find the information on GitHub or another link provided in the catalog. One advantage of doing it this way is that GitHub shows the relative contributions of individuals, so minor contributors would be listed but it would be clear that they are minor contributors.
Okay, my last remaining hesitation about this PR - can we hold off on changing the names of terms and address that as an issue in itself? If we leave everything in the ex: namespace for now (i.e. ex:maintainer rather than doap:maintainer), we can address terms as a whole rather than piecemeal and the SHACL can continue to serve v2. Once we have the shapes the way we want them, we can look at ontologies and terms across the whole dataset and make a single migration script to move away from the ex: namespace. If that's acceptable to you, I'll approve the PR and then create another to switch back to the ex: namespace.
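As a rough illustration of what such a single migration script could look like, here is a minimal sketch. It assumes the dataset can be read as (subject, predicate, object) tuples of prefixed names; `TERM_MAP` and `migrate` are hypothetical, and the only mapping shown is the ex:maintainer / doap:maintainer pair mentioned above.

```python
# Hypothetical mapping from placeholder ex: terms to their final replacements.
# Only the pair mentioned in this thread is listed; the rest would be decided later.
TERM_MAP = {
    "ex:maintainer": "doap:maintainer",
}


def migrate(triples, term_map=TERM_MAP):
    """Return a new list of triples with mapped predicates replaced.
    Subjects, objects, and unmapped predicates are left untouched."""
    return [(s, term_map.get(p, p), o) for s, p, o in triples]
```

Keeping every rename in one table is what lets the terms be reviewed as a whole and applied in one pass rather than piecemeal.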
👍 I was planning to add a commit reverting the namespace changes but got sidetracked; will do it in a moment.
From an out-of-band conversation:
Could you please clarify, maybe we can resolve this one as well.
As I said, I'm okay with this for now when we are gathering input as strings, but at some point we need to be able to check and see - is the entry for ClassOfProduct actually an existing Product Class and where is that defined. In other words it needs to be an IRI. Since the specs use SKOS to define the classes and since SKOS is used elsewhere in the catalog I would think it is a natural fit for that, but perhaps there are other options. If we do not have a drop-down presenting the valid ProductClasses, how can we validate what people enter into the field?
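A minimal sketch of the kind of check I mean, assuming we keep a lookup from each known product class IRI to the specification that defines it. The `KNOWN_PRODUCT_CLASSES` table and the `defined_in` helper are hypothetical; the two IRIs are the ones from the notifications-protocol example earlier in this thread, and the mapping to the spec document is an assumption.

```python
# Hypothetical registry: product class IRI -> IRI of the spec assumed to define it.
SPEC = "https://solidproject.org/TR/notifications-protocol"
KNOWN_PRODUCT_CLASSES = {
    SPEC + "#ResourceServer": SPEC,
    SPEC + "#SubscriptionClient": SPEC,
}


def defined_in(entry, known=KNOWN_PRODUCT_CLASSES):
    """Return the defining spec IRI if `entry` is a known product class IRI,
    otherwise None (e.g. for a free-text keyword)."""
    return known.get(entry.strip())
```

A free-text entry such as "ResourceServer" returns None here, which is exactly the case a drop-down of valid values would prevent.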
jeff-zucker
left a comment
Good to go!
Each specification defines its classes of product, we should expect them to be added together with the Specification. Later implementations can simply reference them.
I'm opening it as a draft for discussion. It is a partial follow-up on #18
Informal SKOS and formal RDFS/OWL can coexist, so for simplicity I'm adding a minimal number of terms.
Those changes would practically allow me to migrate those two dataset class partitions