
scripts for rebuilding a dev environment (with sample data) #7256

@pdurbin

Yesterday in tech hours we talked about things to help developers and I brought up what I think of as the "rebuild with sample data" scripts I've been using for years and years. (I stopped using them only recently because we switched to Payara.)

There are lots of reasons developers might want to rebuild from time to time:

  • You've been running integration tests and all your "nice" datasets are buried beneath them.
  • You're ready for a clean start with known (or no) data.
  • You're having trouble with database upgrade scripts.

By rebuild I don't mean that every single dependency is removed. In fact, the application server (Glassfish or Payara) is largely untouched in my scripts. That said, a number of major changes are made:

  • The database is dropped.
  • The data files are deleted.
  • Solr is cleared out.
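
For illustration, the three steps above could be sketched roughly as follows. This is not the phoenix script itself. The database name and user (`dvndb`, `dvnapp`), the files directory, and the Solr core URL are all assumptions, and the sketch only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Rough sketch of the destructive "rebuild" steps; dry run by default.
# Every name, path, and URL below is an assumption for illustration.
set -eu

DB_NAME="${DB_NAME:-dvndb}"                        # assumed database name
DB_USER="${DB_USER:-dvnapp}"                       # assumed database user
FILES_DIR="${FILES_DIR:-/tmp/dataverse-files-example}"  # hypothetical path
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/collection1}"  # assumed core
DRY_RUN="${DRY_RUN:-1}"                            # 1 = print only

# Print the command in dry-run mode (quoting is lost in the printout),
# otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

rebuild() {
  # 1. Drop the database.
  run dropdb --if-exists -U "$DB_USER" "$DB_NAME"
  # 2. Delete the data files (":?" aborts if the variable is empty).
  run rm -rf -- "${FILES_DIR:?}"
  # 3. Clear Solr with a delete-by-query that matches every document.
  run curl -sS "$SOLR_URL/update?commit=true" \
    -H "Content-Type: text/xml" \
    --data-binary "<delete><query>*:*</query></delete>"
}

rebuild
```

Defaulting to a dry run is a deliberate choice here, since all three steps are irreversible.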

These scripts eventually evolved into the ones we used on the phoenix server for many years. This server was rebuilt as described above on every run. The scripts can be found at https://github.com/IQSS/dataverse/tree/develop/scripts/deploy/phoenix.dataverse.org

After "rebuild" has run, the "post" script is executed. It starts with some setup...

  • Run setup-all.sh.
  • Run SQL scripts (reference data and create sequence).
  • Set DOI provider to FAKE.
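
As a rough sketch only, the setup steps above might look something like this. The location of `setup-all.sh`, the SQL file names and directory, the database name and user, and the settings API URL are assumptions rather than values taken from the phoenix scripts, and the sketch only prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Rough sketch of the "post" setup steps; dry run by default.
# Every name, path, and URL below is an assumption for illustration.
set -eu

SERVER="${SERVER:-http://localhost:8080}"   # assumed application URL
DB_NAME="${DB_NAME:-dvndb}"                 # assumed database name
DB_USER="${DB_USER:-dvnapp}"                # assumed database user
SQL_DIR="${SQL_DIR:-scripts/database}"      # assumed SQL script directory
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print only

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

post_setup() {
  # 1. Run setup-all.sh (assumed to be in the current directory).
  run ./setup-all.sh
  # 2. Run the SQL scripts (file names are assumed).
  for sql in reference_data.sql createsequence.sql; do
    run psql -U "$DB_USER" -d "$DB_NAME" -f "$SQL_DIR/$sql"
  done
  # 3. Set the DOI provider to FAKE (endpoint is assumed).
  run curl -sS -X PUT -d FAKE "$SERVER/api/admin/settings/:DoiProvider"
}

post_setup
```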

... and then continues on to load some sample data. As I mentioned on the call, these scripts create some "birds and trees" users, dataverses, and datasets. (Even though our sample data repo is newer, it doesn't create users.) The Spruce Goose dataset (screenshot below) might be familiar.

When estimating this issue, here are some questions to consider:

  • Do we want the "birds and trees" data? Or would we rather have the sample data? Or both?
  • Should we defer worrying about sample data until a later issue?
  • Should we consider adding a rebuild or reinstallation of Payara as part of this? Or should we stick to the model above?
  • Should we use the bash scripts above as a starting point? Or should this be a (dangerous!) feature of the installer?

[Screenshot: the Spruce Goose dataset]
