Feature: Adding metadata subgraphs and search over them  #45

@kurzum

This is regarding https://moss.tools.dbpedia.org/, which we will need to implement lege artis. A fast demo or working instance, built as a patch to the existing instance, would also be beneficial. Ideally, both coincide.

Description

The main feature wanted is the ability to add further metadata, i.e. a graph of one or several triples that use Databus identifiers either in the triples or as the named graph, so they are clearly linked to the Databus. We can call them additional custom metadata (ACM) graphs. Besides the ability to query them with SPARQL, it would be good to include some of the data in a search index (keywords/autocompletion/facets).
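To make the two linking options concrete, here is a minimal sketch in plain Python. It serializes an ACM graph as N-Quads and checks whether the graph is tied to the Databus through a triple term or through the graph name. The version IRI, the `example.org` property and topic, and the helper names are all hypothetical, and only IRI-valued objects are handled.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject IRI, predicate IRI, object IRI)

DATABUS_BASE = "https://databus.dbpedia.org/"

# Hypothetical identifiers, for illustration only.
VERSION = DATABUS_BASE + "example-user/example-group/example-artifact/2024.01.01"
ACM_GRAPH = "https://example.org/acm/graph-1"
TAG_TRIPLE: Triple = (
    VERSION,
    "https://example.org/acm#containsSomethingAbout",
    "https://example.org/topic/Windenergy",
)


def to_nquads(triples: Iterable[Triple], graph_iri: str) -> str:
    """Serialize IRI-only triples as N-Quads inside one named graph."""
    return "\n".join(
        f"<{s}> <{p}> <{o}> <{graph_iri}> ." for s, p, o in triples
    )


def is_linked_to_databus(triples: Iterable[Triple], graph_iri: str) -> bool:
    """True if the graph name or any triple term is a Databus IRI."""
    if graph_iri.startswith(DATABUS_BASE):
        return True
    return any(
        term.startswith(DATABUS_BASE) for triple in triples for term in triple
    )


if __name__ == "__main__":
    print(to_nquads([TAG_TRIPLE], ACM_GRAPH))
```

The prefix check is a deliberately crude stand-in for a real validation step, which would more likely be expressed as a SHACL shape.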

Challenges

The issue comprises two challenges:

  1. Where to store the data and how to allow users to add more metadata. The lifecycle is important here, as it is defined by the modality of upload.
  2. How best to index it for search. This might include building special indexes or facets for certain metadata such as ontology classes and other URIs.

A note here: while adding additional custom metadata (ACM) graphs would be the full feature, just adding URI tagging would be a smaller, but maybe viable feature, e.g. <version> :containsSomethingAbout <Windenergy>.
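As a rough illustration of how URI tags could feed facets and autocompletion, the sketch below groups tagging triples by predicate and tag, and matches a typed prefix against the local name of each tag URI. It is an in-memory toy and not the search backend of the Databus or MOSS. All IRIs and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject IRI, predicate IRI, object IRI)

# Hypothetical identifiers, for illustration only.
P_ABOUT = "https://example.org/acm#containsSomethingAbout"
WIND = "https://example.org/topic/Windenergy"
SOLAR = "https://example.org/topic/Solarenergy"
V1 = "https://databus.dbpedia.org/example-user/group/artifact-a/2024.01.01"
V2 = "https://databus.dbpedia.org/example-user/group/artifact-b/2024.01.01"

TAGS: List[Triple] = [(V1, P_ABOUT, WIND), (V2, P_ABOUT, WIND), (V2, P_ABOUT, SOLAR)]


def build_facets(triples: Iterable[Triple]) -> Dict[str, Dict[str, Set[str]]]:
    """Map predicate -> tag URI -> set of tagged Databus identifiers."""
    facets: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        facets[p][o].add(s)
    return facets


def local_name(iri: str) -> str:
    """Return the part of an IRI after the last '/' or '#'."""
    return iri.rstrip("/#").rsplit("/", 1)[-1].rsplit("#", 1)[-1]


def autocomplete(triples: Iterable[Triple], prefix: str) -> List[str]:
    """Return tag URIs whose local name starts with the typed prefix."""
    prefix = prefix.lower()
    return sorted(
        {o for _, _, o in triples if local_name(o).lower().startswith(prefix)}
    )


if __name__ == "__main__":
    facets = build_facets(TAGS)
    for tag, subjects in facets[P_ABOUT].items():
        print(local_name(tag), len(subjects))
    print(autocomplete(TAGS, "wind"))
```

A facet count is then just the size of each subject set, which is the number a search UI would show next to a tag.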

Options

Main Upload

During group/artifact/version upload (see quickstart), users could add more context by adding some lines to the JSON-LD. Additionally, admins could extend the default SHACL shapes to validate these lines. Triples are stored in the Databus store.
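A minimal sketch of what "some more lines" in the JSON-LD could look like, using only the Python standard library. The base document here is a stand-in and not the actual Databus upload format. The `acm` prefix, the `containsSomethingAbout` property, and the context URL are assumptions made for illustration.

```python
import copy
import json
from typing import Any, Dict, List

ACM_NS = "https://example.org/acm#"  # hypothetical namespace


def add_topic_tags(doc: Dict[str, Any], topic_iris: List[str]) -> Dict[str, Any]:
    """Return a copy of a JSON-LD node with extra ACM tagging statements."""
    extended = copy.deepcopy(doc)
    extra_context = {"acm": ACM_NS}
    ctx = extended.get("@context")
    # "@context" may be absent, a single value, or a list; keep what is there.
    if ctx is None:
        extended["@context"] = extra_context
    elif isinstance(ctx, list):
        extended["@context"] = ctx + [extra_context]
    else:
        extended["@context"] = [ctx, extra_context]
    extended["acm:containsSomethingAbout"] = [{"@id": iri} for iri in topic_iris]
    return extended


# Stand-in for an upload document; the real one carries many more fields.
BASE_DOC = {
    "@context": "https://example.org/context.jsonld",
    "@id": "https://databus.dbpedia.org/example-user/group/artifact/2024.01.01",
}

if __name__ == "__main__":
    tagged = add_topic_tags(BASE_DOC, ["https://example.org/topic/Windenergy"])
    print(json.dumps(tagged, indent=2))
```

Because the extra statements use the version's own `@id` as their subject, the link to the Databus identifier comes for free, which matches the usability point in the list below.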

| + | already possible, but not searchable|
| + | no federation necessary|
| + | usability is simple once a new upload format is defined and documented, clear relation to the Databus Ids |
| + | ? seems like it can be added to databus main search easily|
| +/- | authorization is bound to the owner account, so it is clear, but limited (only owner can add/edit additional metadata) |
| - | additional metadata lifecycle is bound to the group/artifact/version, necessity to overwrite metadata to change it|

Datawiki

Since we already have GSTORE, it would be easy to implement an additional Datawiki that stores more RDF documents. This might be achieved by defining new graphs, e.g. under an additional custom metadata (ACM) namespace such as https://databus.dbpedia.org/acm/$graphid, and then storing documents there. The workflow would be open to all users like a wiki, where revisions are kept and revert is possible. Triples are stored in the Databus store.
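The wiki lifecycle (keep every revision, allow revert) can be sketched as follows. This is an in-memory toy that does not use GSTORE or its API. The class and method names are hypothetical, and only the `acm/$graphid` naming scheme is taken from the text above.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject IRI, predicate IRI, object IRI)


class AcmGraphStore:
    """Toy revisioned store for ACM graphs (hypothetical, in-memory)."""

    def __init__(self, base: str = "https://databus.dbpedia.org/acm/") -> None:
        self.base = base
        self._history: Dict[str, List[FrozenSet[Triple]]] = {}

    def graph_iri(self, graph_id: str) -> str:
        """Build the named-graph IRI for a graph id."""
        return self.base + graph_id

    def save(self, graph_id: str, triples: Iterable[Triple]) -> int:
        """Store a new revision and return its 1-based revision number."""
        history = self._history.setdefault(graph_id, [])
        history.append(frozenset(triples))
        return len(history)

    def current(self, graph_id: str) -> FrozenSet[Triple]:
        """Return the latest revision of a graph."""
        return self._history[graph_id][-1]

    def revert(self, graph_id: str, revision: int) -> int:
        """Re-publish an old revision as the newest one; nothing is deleted."""
        history = self._history[graph_id]
        if not 1 <= revision <= len(history):
            raise ValueError(f"unknown revision {revision} for {graph_id}")
        history.append(history[revision - 1])
        return len(history)


if __name__ == "__main__":
    store = AcmGraphStore()
    good = ("https://example.org/s", "https://example.org/p", "https://example.org/o")
    store.save("demo", [good])
    store.save("demo", [])            # e.g. an unwanted edit that empties the graph
    print(store.revert("demo", 1))    # the revert becomes a new revision
    print(store.graph_iri("demo"))
```

Recording a revert as a new revision rather than truncating the history keeps the full edit trail, which matters if the wiki is open to all users.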

| + | wiki lifecycle for additional graphs, i.e. separate from the artifact lifecycle|
| + | no federation necessary|
| + | ? seems like it can be added to databus main search easily|
| +/- | possible, but some work required|
| +/- | authorization is more liberal, but not well-defined yet|
| - | might be exploited to add a lot of data/extra graphs by users, might need some config by admin|
| - | needs some implementation|
| - | relation to Databus Id might be established over the graph name, which is not ideal|
| - | extra component to maintain|

Mods via push to master

Marvin suggested that a push feature for the mod master could be used to allow users to post graphs connected to Databus IDs. The data would then be transferred from the mod master to a mod worker and handled there, e.g. stored and validated. This would mean hosting the data in a separate triple store and might also need additional access control. Besides handling the ACM graphs, allowing a push to mods seems useful in general. Making mod data (including the autogenerated data) more searchable and more visible on the bus also has merit.

| + | follows the databus design|
| - | federation is necessary|
| - | we need to tackle making mods more visible in the Databus|

External app

This could be built completely externally.

ChatGPT answers

ACM-Feature.md
