Binderverse: integrating Binderhub with Dataverse (using docker+kubernetes) #4714

@aculich

Issue #4665 would help us in our work to integrate Dataverse with Binderhub.

Our goal with the integration is to allow anyone browsing datasets in Dataverse to instantly launch a Binderhub-based reproducible compute environment. Binder is a Jupyter project that lets researchers easily specify software requirements, which are automatically built into Docker containers and spawned in a Kubernetes-based Jupyterhub environment.

During Spring 2018, two undergraduate students explored this new feature/integration by installing and modifying Dataverse; see this example repo: https://github.com/sean-dooher/binderverse

We would like to be able to launch Binder from an existing Dataverse instance by adding a "launch binder" button in the Dataverse UI itself. The first iteration hacked the Dataverse code directly to add the button; further discussion on the Dataverse forum about the External Tools Dataset Extension then resulted in this final proof-of-concept at the end of the semester.

We would also like to enable mybinder.org to accept a DOI as input (instead of a GitHub URL) and automatically find a valid Dataverse instance that contains the data along with a requirements.txt (and other related Binder files):

Example Integration

For this latter DOI functionality, we've also explored an integration of Binderhub+Dataverse+OSF (Open Science Framework).
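To make the DOI idea concrete, here is a minimal sketch of the lookup side, not the proof-of-concept's actual code. It assumes the Dataverse native API's dataset-by-persistent-ID endpoint (`/api/datasets/:persistentId/?persistentId=doi:...`) and a hand-picked set of Binder config file names; the helper names, the example server URL, and the placeholder DOI are all hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Assumption: a subset of the config files that Binder (via repo2docker) recognizes.
BINDER_CONFIG_FILES = {
    "requirements.txt",
    "environment.yml",
    "Dockerfile",
    "postBuild",
    "apt.txt",
    "runtime.txt",
}

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Reduce 'https://doi.org/10.x/y' or 'doi:10.x/y' to the bare '10.x/y' form."""
    doi = raw.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def dataset_api_url(base_url: str, doi: str) -> str:
    """Build the URL that asks one Dataverse instance about a dataset by its DOI."""
    query = urlencode({"persistentId": f"doi:{normalize_doi(doi)}"})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def binder_files(filenames):
    """Return the Binder config files present among a dataset's file names."""
    return sorted(name for name in filenames if name in BINDER_CONFIG_FILES)


if __name__ == "__main__":
    # Placeholder DOI and server, for illustration only.
    print(dataset_api_url("https://demo.dataverse.org", "doi:10.5072/FK2/ABCDEF"))
    print(binder_files(["data.csv", "requirements.txt", "postBuild"]))
```

A launcher could try this URL against each known Dataverse instance, treat the first successful response as the "valid instance", and only offer a Binder launch when `binder_files` returns something for that dataset's file list.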

If we resume this work, it would be helpful to have an easy-to-deploy Dockerized version of Dataverse to simplify rapid prototyping initially, and in the long run we'd also like to run it in production on top of Kubernetes.
