Niva Project

Field delineation using the ResUNet-a model.

About The Project

This project is based on the Sentinel Hub field delineation project.

Getting Started

This section will guide you through setting up the project and executing the full deep learning training and inference pipeline.

Prerequisites

Python

This project uses Python 3.10, which can be installed alongside other Python versions.

  • Add python repository

    sudo add-apt-repository ppa:deadsnakes/ppa
    sudo apt update
  • Install python 3.10

    sudo apt install python3.10 python3.10-venv python3.10-dev
    python3.10 --version

You may also want to install it from source by downloading the tarball from the official Python website, extracting it, and following the build instructions.

GEOS

The GEOS library is required by some of the Python modules used. On Debian/Ubuntu it is typically provided by the libgeos-dev package:

sudo apt install libgeos-dev

PostgreSQL (INFERENCE ONLY)

libpq is the C application programmer's interface to PostgreSQL. It is required to install the psycopg2 package. On Debian/Ubuntu it is typically provided by the libpq-dev package:

sudo apt install libpq-dev

To use profiling tools

The project comes with a script that allows using the Darshan I/O profiler or the Nsight Systems performance analysis tool during training and data preprocessing. Both must be installed to generate traces.

Both Darshan and Nsight come bundled with tools separated into host and target sections:

  • target tools should be installed on the compute node executing the training program; they are responsible for generating raw traces that are not directly readable.
  • host tools should be installed on your personal workstation (with a screen); they are used to visualize and analyze tracing results.

Thus, if you work in an HPC environment, Darshan and Nsight should be installed both on the system you use for computations and on the one you use for analysis.

Installation

1. Clone the repository

git clone https://github.com/Jodp1905/Niva_Project.git

2. Configure the project

You can modify some parameters used in the project in the config.yaml file under /config.

YAML config file values can also be overridden by environment variables if set; see ENV_CONFIG.md under /config for a description of all parameters.

Upon downloading the project, all parameters will first be set to their default values.

⚠️ Warning: niva_project_data_root must be set for any part of the project to run. All data used for training, the trained models, and the inference results will be written in this directory.

Set it in the yaml file where it is null by default, or export it as an environment variable:

mkdir /path/to/project_data/
export NIVA_PROJECT_DATA_ROOT=/path/to/project_data/

You may also add the line to your ~/.bashrc file for convenience.
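The precedence described above (environment variable over YAML value) can be sketched as follows. This is only an illustration of the mechanism; the project's actual config loader may differ:

```python
import os

def resolve_data_root(yaml_value, env=None):
    """Return the data root, letting the NIVA_PROJECT_DATA_ROOT
    environment variable override the value from config.yaml
    (which is null by default)."""
    env = os.environ if env is None else env
    return env.get("NIVA_PROJECT_DATA_ROOT", yaml_value)

# An exported variable takes precedence over the YAML value
print(resolve_data_root(None, {"NIVA_PROJECT_DATA_ROOT": "/path/to/project_data/"}))
# With no variable set, the YAML value is used as-is
print(resolve_data_root("/from/yaml", {}))
```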

3. Install required python packages

You can install the necessary Python packages using the requirements files at the root of the project:

  • requirements_training.txt contains the packages necessary for running the training
  • requirements_inference.txt contains the packages necessary for running the inference pipeline

They are separated to allow for more flexibility in the installation process, as the inference pipeline requires a PostgreSQL installation.

The use of a virtual environment is advised:

  • Create and activate virtual environment

    python3.10 -m venv /path/to/niva-venv
    source /path/to/niva-venv/bin/activate
  • Install from the appropriate requirements file in the venv

    pip install -r requirements_training.txt  # or requirements_inference.txt

Usage

This section provides an overview of how you can get started and run the full field delineation training pipeline from end to end. Bash scripts are available under /scripts to facilitate execution, but you may also use the Python scripts under /src directly.

1. Download dataset

To download the ai4boundaries dataset from the Joint Research Centre Data Catalogue ftp servers, use the download_dataset.sh script:

cd ./scripts
./download_dataset.sh

Data is downloaded to the location specified by niva_project_data_root, under /sentinel2, and split into 3 folders corresponding to the training configurations: training/validation/testing. An internet connection is required.

You should have this structure after download:

niva_project_data_root
└── sentinel2
    ├── ai4boundaries_ftp_urls_sentinel2_split.csv
    ├── test
    │   ├── images        # 2164 files
    │   └── masks         # 2164 files
    ├── train
    │   ├── images        # 5236 files
    │   └── masks         # 5236 files
    └── val
        ├── images        # 431 files
        └── masks         # 432 files
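To verify the download, you can count the files in each images/masks folder and compare against the numbers above. The helper below is a sketch; the demo runs it on a small mock tree so the snippet is self-contained, but you would point it at your own niva_project_data_root:

```python
import os
import tempfile

def count_files(data_root):
    """Count files under sentinel2/<split>/<images|masks>."""
    counts = {}
    for split in ("train", "val", "test"):
        for kind in ("images", "masks"):
            folder = os.path.join(data_root, "sentinel2", split, kind)
            counts[f"{split}/{kind}"] = len(os.listdir(folder))
    return counts

# Demo on a mock tree with one placeholder file per folder
mock_root = tempfile.mkdtemp()
for split in ("train", "val", "test"):
    for kind in ("images", "masks"):
        folder = os.path.join(mock_root, "sentinel2", split, kind)
        os.makedirs(folder)
        open(os.path.join(folder, "placeholder.nc"), "w").close()

print(count_files(mock_root))
```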

2. Preprocessing pipeline

The downloaded dataset can now be preprocessed using the run_preprocessing.sh script:

cd ./scripts
./run_preprocessing.sh

The preprocessing pipeline creates multiple folders under niva_project_data_root corresponding to its different executed steps, while keeping the test/train/val structure:

niva_project_data_root
├── datasets
│   ├── test
│   ├── train
│   └── val
├── eopatches
│   ├── test
│   ├── train
│   └── val
├── npz_files
│   ├── test
│   ├── train
│   └── val
├── patchlets_dataframe.csv
└── sentinel2

3. Training

Once the datasets are created, you can run the model training using the run_training.sh script:

cd ./scripts
./run_training.sh

After training, the resulting model will be saved under /models in niva_project_data_root as "training_$date" by default:

niva_project_data_root
├── datasets
├── eopatches
├── sentinel2
├── npz_files
├── patchlets_dataframe.csv
└── models
    ├── training_20240922_160621
    └── training_20240910_031256
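The default run names shown above follow a training_YYYYMMDD_HHMMSS pattern. A name of this shape can be generated with the standard library; the exact format string used by the project is an assumption:

```python
from datetime import datetime

def default_training_name(now=None):
    """Build a run name like training_20240922_160621 (assumed format)."""
    now = now or datetime.now()
    return now.strftime("training_%Y%m%d_%H%M%S")

print(default_training_name(datetime(2024, 9, 22, 16, 6, 21)))
# -> training_20240922_160621
```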

You may also directly execute the Python script and set a name of your choice:

cd ./src/training
python3 training.py <training-name>

Once the training has been executed, you can use the training name with the main_analyze script under /utils to generate loss/accuracy plots as well as a textual description of hyperparameters, memory usage, and model size:

cd ./src/training
python3 main_analyze.py <training-name>

4. Tracing

Generating traces

You will first have to set the following parameters in tracing_wrapper.sh under /scripts:

  • PYTHON_VENV_PATH: The path to the root folder of your python environment used to run training.
  • DARSHAN_LIBPATH: Path to libdarshan.so, should be under path/to/darshan-runtime-install/lib/.
  • NSIGHT_LOGDIR: set tracefile output directory for nsight traces.

Note: The Darshan trace files will be saved under the path set when installing darshan-runtime; use darshan-config --log-path to view it.

  • DARSHAN_DXT: Set to 1 to enable Darshan eXtended Tracing, which generates more detailed I/O traces, including a per-file summary of all I/O operations.
  • NSIGHT_NVTX: Training-specific option. Set to 1 to enable profiling of samples of training batches instead of the entire training execution. Note that nsight_batch_profiling also needs to be enabled in the YAML configuration file.
  • NSIGHT_STORAGE_METRICS: Set to 1 to enable the storage_metrics Nsight plugin, which should be included in your Nsight distribution.
  • LUSTRE_LLITE_DIR: Used with NSIGHT_STORAGE_METRICS=1; path to the Lustre LLITE directory for capturing Lustre I/O metrics.
  • NSIGHT_PYTHON_SAMPLING: Set to 1 to enable Nsight collection of Python backtrace sampling events. Overhead can be high.

Once set, use the script to generate darshan or nsight reports of the training process:

./tracing_wrapper.sh <darshan/nsight> ./run_training.sh 

Note that, as a wrapper, it can be used with any command/binary/script. You can therefore use it directly with the Python interpreter:

./tracing_wrapper.sh <darshan/nsight> python3 ./../src/training/training.py

Visualizing traces

Once the wrapped program has terminated, Darshan (.darshan) or Nsight (.nsys-rep) trace files are created, under the default log path for Darshan or under the directory you set in the script parameters for Nsight. You may have to download them to your host machine to use the visualization tools.

You can use the following:

  • Darshan: You can first use the darshan Python package to create a job summary with heatmaps and I/O metrics for all enabled Darshan modules:

    python3 -m darshan summary ./my_darshan_trace_file.darshan

    This will create an HTML file that you can view on a local live server.

    If you enabled DXT traces, you can use the darshan-dxt-parser command from darshan-util to create a detailed report of timestamped I/O accesses for all files:

    darshan-dxt-parser ./my_darshan_trace_file.darshan
  • Nsight:

    Visualizing Nsight traces amounts to using the nsys-ui command:

    nsys-ui ./my_nsys_trace_file.nsys-rep

Implementation details

Data preprocessing implementation

A preprocessing workflow diagram is available under /visuals:

Preprocessing Workflow

Inference implementation

TODO add inference diagram
