diff --git a/.github/workflows/flashx.yml b/.github/workflows/flashx.yml index e4d5494f..fd290c98 100644 --- a/.github/workflows/flashx.yml +++ b/.github/workflows/flashx.yml @@ -6,6 +6,7 @@ on: - main - development paths-ignore: + - '**.rst' - '**.md' - 'LICENSE' - 'CITATION' diff --git a/.github/workflows/flowx.yml b/.github/workflows/flowx.yml index 6963cffd..d56a1ecd 100644 --- a/.github/workflows/flowx.yml +++ b/.github/workflows/flowx.yml @@ -6,6 +6,7 @@ on: - main - development paths-ignore: + - '**.rst' - '**.md' - 'LICENSE' - 'CITATION' diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml index b42b75b4..e9723b25 100644 --- a/.github/workflows/linting.yml +++ b/.github/workflows/linting.yml @@ -6,6 +6,7 @@ on: - main - development paths-ignore: + - '**.rst' - '**.md' - 'LICENSE' - 'CITATION' diff --git a/.github/workflows/minimal.yml b/.github/workflows/minimal.yml index f495358f..f4e0c601 100644 --- a/.github/workflows/minimal.yml +++ b/.github/workflows/minimal.yml @@ -6,6 +6,7 @@ on: - main - development paths-ignore: + - '**.rst' - '**.md' - 'LICENSE' - 'CITATION' diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index 44e0c3c3..cbd2dd49 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -7,6 +7,7 @@ on: # branches: # - main paths-ignore: + - '**.rst' - '**.md' - 'LICENSE' - 'CITATION' diff --git a/README.rst b/README.rst index 58a59071..0e815767 100644 --- a/README.rst +++ b/README.rst @@ -10,10 +10,10 @@ Overview ********** -An overview of BoxKit is available in ``paper/paper.md`` that can be -compiled into a Journal of Open Source Software (JOSS) pdf by running -``make`` in the ``paper`` directory. Please note that the ``Makefile`` -requires a functioning Docker service on the machine. +An overview of BoxKit is available in ``paper/paper.md`` that provides a +summary and statement of need for the package. 
You can compile it into a +pdf by running ``make`` in the ``paper`` directory. Please note that the +``Makefile`` requires a functioning Docker service on the machine. ************** Installation @@ -26,9 +26,6 @@ Stable releases of BoxKit are hosted on Python Package Index website pip install BoxKit --user - export CXX=$(CPP_COMPILER) - pip install BoxKit --user --install-option="--enable-testing" --install-option="--with-cbox" - Note that ``pip`` should point to ``python3+`` installation package ``pip3``. @@ -38,9 +35,35 @@ using, .. code:: pip install --upgrade BoxKit --user - pip install --upgrade BoxKit --user --install-option="--enable-testing" --install-option="--with-cbox" pip uninstall BoxKit +Pre-release versions can be installed directly from the git repository by +executing, + +.. code:: + + pip install git+ssh://git@github.com/akashdhruv/BoxKit.git --user + +BoxKit provides various installation options that can be used to +configure the library with desired features. Following is a list of +options, + +.. code:: + + with-cbox - With C++ backend + with-pyarrow - With Apache Arrow data backend + with-zarr - With Zarr data backend + with-dask - With Dask data/parallel backend + enable-testing - Enabling testing mode for development + +Correspondingly, the installation command can be modified to include +necessary options as follows, + +.. code:: + + export CXX=$(CPP_COMPILER) + pip install BoxKit --user --install-option="--enable-testing" --install-option="--with-cbox" + There maybe situations where users may want to install BoxKit in development mode $\\textemdash$ to design new features, debug, or customize classes/methods to their needs. This can be easily @@ -62,7 +85,49 @@ source code and is an effective method for debugging. Note that the ******* Usage ******* -Add usage details here + +After ``pip`` installation, BoxKit can be imported inside a Python +environment by adding the following to IPython notebooks and scripts, + +.. 
code:: python + + import boxkit + +Once the library is imported in the environment, simulation datasets can +be read by executing, + +.. code:: python + + dset = boxkit.read_dataset(path_to_hdf5_file, source="flash") + +New datasets can be created using the ``create_dataset`` method + +.. code:: python + + dset = boxkit.create_dataset(*args, **kwargs) + +A full list of arguments can be found in the documentation. + +************* + Performance +************* + +|performance| + +********* + Testing +********* + +Testing for BoxKit is performed across different hardware platforms +where high-fidelity simulation data can reside. The sites $\\textemdash$ +acadia and sedona refer to Mac and Ubuntu operating systems, +respectively, where regular testing takes place. + +For lightweight testing during pull requests and merges, new tests can +be added to ``tests/container``. Each test should be accompanied by a +corresponding addition to YAML files located under ``.github/workflows``. +See ``tests/container/heater.py`` and ``.github/workflows/flashx.yml`` +for an example. ********** Citation @@ -70,23 +135,23 @@ Add usage details here .. code:: - @software{akash_dhruv_2022_7255632, + @software{akash_dhruv_2023_8063195, author = {Akash Dhruv}, - title = {akashdhruv/BoxKit: October 2022}, - month = oct, - year = 2022, + title = {akashdhruv/BoxKit: June 2023}, + month = jun, + year = 2023, publisher = {Zenodo}, - version = {22.10}, - doi = {10.5281/zenodo.7255632}, - url = {https://doi.org/10.5281/zenodo.7255632} + version = {2023.06}, + doi = {10.5281/zenodo.8063195}, + url = {https://doi.org/10.5281/zenodo.8063195} } ************** Contribution ************** -Contribution to the source code is encouraged. Developers can create -pull requests from their individual forks to the ``development`` branch. +Developers are encouraged to fork the repository and contribute to the +source code in the form of pull requests to the ``development`` branch.
Please read ``DESIGN.rst`` for an overview of software design and developer guide @@ -94,7 +159,8 @@ developer guide Help & Support **************** -Please file an issue on the repository page +Please file an issue on the repository page to report bugs, request +features, and ask questions about usage .. |Code style: black| image:: https://img.shields.io/badge/code%20style-black-000000.svg :target: https://github.com/psf/black @@ -111,3 +177,6 @@ Please file an issue on the repository page .. |icon| image:: ./media/icon.svg :width: 30 + +.. |performance| image:: ./media/performance.png + :width: 1000 diff --git a/media/performance.png b/media/performance.png new file mode 100644 index 00000000..ec48802b Binary files /dev/null and b/media/performance.png differ diff --git a/paper/paper.bib b/paper/paper.bib index 9d0e3a77..73a9921b 100644 --- a/paper/paper.bib +++ b/paper/paper.bib @@ -43,3 +43,22 @@ @misc{argonne year = 2023, url = {https://www.anl.gov/topic/business/laboratory-directed-research-and-development-ldrd} } + +@ARTICLE{yt, + author = {{Turk}, M.~J. and {Smith}, B.~D. and {Oishi}, J.~S. and {Skory}, S. and +{Skillman}, S.~W. and {Abel}, T. 
and {Norman}, M.~L.}, + title = "{yt: A Multi-code Analysis Toolkit for Astrophysical Simulation Data}", + journal = {The Astrophysical Journal Supplement Series}, +archivePrefix = "arXiv", + eprint = {1011.3514}, + primaryClass = "astro-ph.IM", + keywords = {cosmology: theory, methods: data analysis, methods: numerical}, + year = 2011, + month = jan, + volume = 192, + eid = {9}, + pages = {9}, + doi = {10.1088/0067-0049/192/1/9}, + adsurl = {http://adsabs.harvard.edu/abs/2011ApJS..192....9T}, + adsnote = {Provided by the SAO/NASA Astrophysics Data System} +} diff --git a/paper/paper.md b/paper/paper.md index 960ec06c..ea24af82 100644 --- a/paper/paper.md +++ b/paper/paper.md @@ -19,10 +19,10 @@ bibliography: paper.bib BoxKit is a library that provides building blocks to parallelize and scale data science, statistical analysis, and machine learning -applications for block-structured datasets. Spatial data from -simulations can be accessed and managed using tools -available in this library when working with Python-based -packages like SciKit, PyTorch, and OpticalFlow. +applications for block-structured simulation datasets. Spatial data +from simulations can be accessed and managed using tools available +in this library to interface with packages like SciKit, PyTorch, and +OpticalFlow for post-processing and analysis. The library provides a Python interface to efficiently access Adaptive Mesh Refinement (AMR) data typical of simulation outputs, and leverages @@ -41,7 +41,7 @@ represent chunks of data stored on disk, acting as array like input for Python functions/methods. This approach allows for selective loading of data from disk to memory in form of chunks/blocks which improves cache efficiency. The library also enables creation of new datasets for data-intensive workflows, and can be extended beyond its current -application to numerical simulations. +application to numerical simulations. 
![BoxKit is designed to integrate simulation software instruments like Flash-X with Python-based machine learning and data analysis packages. Large simulation