
Add platform-dependent Conda YAML and environments/module dependencies #47

@felker


The current requirements-travis.txt in the root directory of the repository should be converted to a YAML file that follows the Conda environment file format. Note especially the version range syntax: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html#package-match-specifications

requirements.yaml should contain the Conda dependencies for a generic (GPU?) platform.

We will probably want custom Conda environment files for each of the following computers, mostly to handle the non-default Conda channels that may be necessary (e.g. IBM Watson AI for the 2x Power9 architectures with V100s). We also want to set strict channel priority, be more specific about compatible dependency version ranges, etc.

  • Princeton Research Computing
    • Tiger 2/TigerGPU P100s
    • Traverse V100s
  • ALCF
    • Theta KNLs
    • Cooley K80s
  • OLCF
    • Summit V100s
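
As a rough illustration of the generic file, the sketch below writes a minimal environment YAML using the comma-separated version range syntax from the package match specification page linked above. The package list, version bounds, channel, and the envs/ location are all placeholders chosen for illustration, not FRNN's actual requirements.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output location; the issue has not settled on envs/ vs. environments/.
env_dir="${ENV_DIR:-envs}"
mkdir -p "$env_dir"

# Package names and version bounds below are placeholders, not FRNN's real dependencies.
# In a match specification, a comma means AND: ">=1.16,<1.18" allows 1.16.x and 1.17.x.
cat > "$env_dir/requirements-generic.yaml" <<'EOF'
name: frnn
channels:
  - defaults
dependencies:
  - python>=3.6,<3.8
  - numpy>=1.16,<1.18
  - pip
EOF

echo "wrote $env_dir/requirements-generic.yaml"
```

The resulting file could then be passed to `conda env create --file envs/requirements-generic.yaml`. Strict channel priority would probably have to be set separately, e.g. with `conda config --set channel_priority strict`.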

Also, I am in favor of storing per-platform files such as traverse-env.cmd, containing e.g. the following lines:

#!/usr/bin/env bash

module load anaconda3
conda activate frnn  # must activate conda env before module loads
export OMPI_MCA_btl="tcp,self,vader"

module load cudatoolkit
module load cudnn/cuda-10.1/7.6.1  
module load openmpi/gcc/3.1.4/64
module load hdf5/gcc/openmpi-3.1.4/1.10.5

This will make it easier for the user to build FRNN on each platform after creating the Conda environment, e.g. from the new directory named envs/ or environments/:

conda env create --file envs/requirements-traverse.yaml
# alternative: "conda env create --name frnn --file envs/requirements-traverse.yaml"
source traverse-env.cmd
python setup.py install

See Sample Installation on TigerGPU, for example.

Also, examples/slurm.cmd can source the same .cmd file, so that batch jobs load exactly the same modules as the interactive build.

However, this will require frequent updates to the *.cmd files as system admins upgrade the modules and libraries on the various platforms.

  • Should the per-platform pairs of .yaml and .cmd files all be stored in a top-level subdirectory, e.g. environments/, env/, or envs/? How do other projects handle this problem and organize such files? Or do they typically only define a single Conda YAML?
  • Have examples/slurm.cmd check the hostname in order to automatically source the correct .cmd?
  • Should the .cmd environment files assume that the Conda environment is named frnn? Or force the user to define an environment variable FRNN_CONDA_ENV?
  • Update PrincetonUTutorial.md after these changes are made
  • Remove all version info from list of dependencies in setup.py.
  • Add optional feature dependencies to setup.py
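
To make the hostname and FRNN_CONDA_ENV questions above more concrete, here is one possible sketch of the dispatch logic that examples/slurm.cmd could use. The function name `frnn_env_file`, the hostname patterns, and the tigergpu-env.cmd file name are hypothetical; the real node names on each cluster would need to be checked (compute nodes often have different names than login nodes).

```shell
#!/usr/bin/env bash

# Map a hostname to a per-platform environment file.
# Patterns and file names are assumptions for illustration only.
frnn_env_file() {
  local host="$1"
  case "$host" in
    traverse*) echo "envs/traverse-env.cmd" ;;
    tigergpu*) echo "envs/tigergpu-env.cmd" ;;
    *)         return 1 ;;  # unknown platform
  esac
}

# Default to an environment named "frnn" unless the user overrides it.
: "${FRNN_CONDA_ENV:=frnn}"

# uname -n is used instead of hostname because it is specified by POSIX.
current_host="$(uname -n)"

if env_file="$(frnn_env_file "$current_host")"; then
  # A real script would run: source "$env_file"
  echo "would source $env_file with Conda environment $FRNN_CONDA_ENV"
else
  echo "no environment file known for host $current_host" >&2
fi
```

With this approach, each .cmd file could call `conda activate "$FRNN_CONDA_ENV"` instead of hard-coding frnn, so users who keep the default name would not need to set anything.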

Current limitations of Conda YAML format:
