temporal-hpc/ORCS

ORCS

ORCS stands for "Optimized Ray-tracing Core Simulation". It targets Fixed-Radius Nearest Neighbors (FRNN) interactions in 3D space. This repository contains the source code and data sets that we have made available to the community as part of our published research, "Advancing RT Core-Accelerated Fixed-Radius Nearest Neighbor Search".

arXiv article: "Advancing RT Core-Accelerated Fixed-Radius Nearest Neighbor Search", https://arxiv.org/abs/2601.15633

Prerequisites

  • CMake ≥ 3.20
  • A C++17-capable host compiler and NVCC (CUDA Toolkit 11.x or newer recommended)
  • NVIDIA GPU with RTX hardware support and a recent driver exposing libnvidia-ml.so
  • NVIDIA OptiX 7 SDK (OPTIX_HOME should point to the SDK root)
  • OpenMP-capable host toolchain (for CPU solver parallelism)

Optional but recommended for plotting and code maintenance:

  • Python 3 with matplotlib/pandas for plotting utilities in plots/
  • clang-format for maintaining code style (clang-format configuration included)
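Before configuring, it can help to confirm that the command-line tools implied by the list above are on PATH. The following is a minimal sketch, not part of ORCS: the tool lists are assumptions drawn from the prerequisites, and the check does not cover the OptiX SDK or libnvidia-ml.so, which are located by CMake and the linker rather than through PATH.

```python
import shutil

# Assumed tool names, taken from the prerequisites list; adjust for your setup.
REQUIRED_TOOLS = ["cmake", "nvcc", "nvidia-smi"]
OPTIONAL_TOOLS = ["python3", "clang-format"]


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        missing = missing_tools(tools)
        status = "all found" if not missing else "missing " + ", ".join(missing)
        print(f"{label}: {status}")
```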

Building ORCS

mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=<Release|Debug> -DOPTIX_HOME=/opt/optix-sdk
make -j

Notes:

  • OPTIX_HOME must reference the OptiX 7 SDK installation root so that headers and libraries can be located.
  • The build system auto-detects the local GPU SM architecture via nvidia-smi; override with -DCMAKE_CUDA_ARCHITECTURES=86 (or similar) for reproducible deployments.
  • NVML is required at link time. Ensure your NVIDIA driver install provides libnvidia-ml.so (usually in /usr/lib/x86_64-linux-gnu/).
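To pin -DCMAKE_CUDA_ARCHITECTURES explicitly, the compute capability reported by nvidia-smi can be converted to the two-digit form CMake expects. This is a hedged sketch: the README does not say which nvidia-smi query the build system uses, so the `--query-gpu=compute_cap` call below is an assumption (it needs a reasonably recent driver), and both function names are hypothetical.

```python
import subprocess


def compute_cap_to_arch(compute_cap):
    """Convert a compute capability such as '8.6' to the CMake form '86'."""
    major, minor = compute_cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def detect_cuda_arch(device_index=0):
    """Ask nvidia-smi for one GPU's compute capability (assumed query)."""
    out = subprocess.run(
        ["nvidia-smi", f"--id={device_index}",
         "--query-gpu=compute_cap", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return compute_cap_to_arch(out.splitlines()[0])
```

On a machine with a GPU, `detect_cuda_arch()` would return a string suitable for `-DCMAKE_CUDA_ARCHITECTURES=<value>`.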

Key CMake Options

Boolean options are toggled with -D<OPTION>=ON/OFF during configuration; value options (DELTA_TIME, RTXNEIGHBORS_OPTIX_ARCH) are set with -D<OPTION>=<value>.

| Option | Default | Description |
| --- | --- | --- |
| USE_DOUBLE_PRECISION | ON | Switches real typedefs to double. Disable for faster single-precision runs. |
| USE_PERIODIC_BOUNDARY | OFF | Compiles with periodic boundary conditions as the default. Runtime flag --border_type can still override per run. |
| LOGRTXTIME | OFF | Prints raw iteration timings from the RTX solver instead of the formatted summary. |
| MEASURE_POWER | OFF | Enables NVML/CPU power monitors. Requires NVML support and sufficient privileges. |
| LOG_INTERACTIONS | OFF | Persists per-iteration interaction counts to interaction_stats.csv. |
| DELTA_TIME | 0.001 | Compile-time constant injected into solvers (useful for physics tuning). |
| CPU_NATIVE | OFF | Adds -march=native (or equivalent) to host compilation for maximum CPU performance. |
| RTXNEIGHBORS_OPTIX_ARCH | empty | Overrides the OptiX PTX -arch (e.g. compute_90). Useful when targeting different GPUs than the build host. |

Re-configure with new flags to rebuild, e.g.:

cmake .. -DCMAKE_BUILD_TYPE=<Release|Debug>  -DOPTIX_HOME=/opt/optix-sdk -DCPU_NATIVE=ON -DUSE_DOUBLE_PRECISION=OFF
make -j
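When several configurations need to be built (for example single versus double precision), the configure line can be assembled programmatically. The helper below is an illustrative sketch, not part of the repository: `cmake_configure_command` is a hypothetical name, and it only formats the options documented in the table above without validating them.

```python
def cmake_configure_command(optix_home, build_type="Release", **options):
    """Build the argument list for `cmake ..`, run from the build directory.

    Boolean option values become ON/OFF; anything else is passed through
    as-is (e.g. DELTA_TIME=0.001 or RTXNEIGHBORS_OPTIX_ARCH="compute_90").
    """
    cmd = [
        "cmake", "..",
        f"-DCMAKE_BUILD_TYPE={build_type}",
        f"-DOPTIX_HOME={optix_home}",
    ]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        cmd.append(f"-D{name}={value}")
    return cmd


if __name__ == "__main__":
    # Reproduces the re-configure example above as a printable command line.
    print(" ".join(cmake_configure_command(
        "/opt/optix-sdk", CPU_NATIVE=True, USE_DOUBLE_PRECISION=False)))
```

The returned list can be handed to `subprocess.run(cmd, cwd="build", check=True)`.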

Running Simulations

From the build directory:

./ORCS <method> <n_particles> <max_neighbors> [options]

Solver Methods

| ID | Solver | Description |
| --- | --- | --- |
| 0 | CPU | Baseline CPU neighbor search with OpenMP parallelism. |
| 1 | GRID | Uniform grid acceleration structure on the GPU. |
| 2 | RTX | OptiX-based neighbor search. |
| 3 | RTX Physics | RTX solver with physics computed in OptiX shaders. |
| 4 | RTX Payload | Variant using OptiX payload for neighbor data transport. |
| 5 | NaiveGPU | Reference CUDA implementation (no spatial acceleration). |
| 6 | GridV2 | Experimental grid-based GPU solver. |
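As a rough illustration of what these solvers compute, the sketch below contrasts a brute-force search (in the spirit of the NaiveGPU reference) with a uniform-grid search (the idea behind GRID). It is plain Python and not ORCS code. It assumes a single search radius shared by all particles, whereas ORCS derives per-particle radii from its radius options and the cutoff factor.

```python
import itertools
import math
from collections import defaultdict


def frnn_brute_force(points, radius):
    """Return all (i, j) pairs with i < j whose distance is <= radius. O(n^2)."""
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= radius
    ]


def frnn_uniform_grid(points, radius):
    """Same result, but each point is only tested against the 27 nearby cells."""
    # Cell edge equals the search radius, so neighbors lie in adjacent cells.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(c / radius) for c in p)].append(idx)

    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3):
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(points[i], points[j]) <= radius:
                        pairs.add((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    pts = [(0, 0, 0), (0.5, 0, 0), (2, 2, 2), (2.3, 2, 2), (5, 5, 5)]
    print(frnn_brute_force(pts, 1.0))
    print(frnn_uniform_grid(pts, 1.0))
```

Both functions return the same pair list; the grid version avoids the quadratic scan by bucketing points first, which is the same motivation behind the GPU acceleration structures listed above.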

Common Options

| Flag | Default | Purpose |
| --- | --- | --- |
| -c, --cutoff_radius <factor> | 2.5 | Multiplier applied to particle radii when searching neighbors. |
| -p, --positions <g\|u\|n> | g | Particle placement: grid, uniform random, or normal mixture. |
| --pp1 <value> | 1.0 | Extra parameter for position generator (stddev for normal, jitter scale otherwise). |
| --pp2 <value> | 1 | Additional shape parameter (e.g. number of normal "bells"). |
| -r, --radius_distribution <u\|n\|l> | u | Radius distribution (uniform, normal, lognormal). |
| --rmin/--rmax <value> | 100 | Minimum and maximum particle radii before applying the cutoff factor. |
| --pr1/--pr2 <value> | 1.0 | Distribution-specific radius parameters (mean/stddev). |
| -i, --iterations <count> | 10 | Number of benchmark iterations to execute. |
| -s, --seed <int> | 0 | RNG seed for reproducible particle sets. |
| -o, --output_file <path> | empty | Writes final positions/velocities to the specified file. |
| -v, --verbose <0\|1\|2> | 0 | Runtime logging level (requires DEBUG build for detailed prints). |
| -m, --use_morton | off | Enables Morton ordering before neighbor search. |
| -x, --bvh_rebuild_scheme <0-4> | 0 | Selects BVH rebuild strategy (see below). |
| -w, --window <int> | 10 | Sliding window size for rebuild heuristics. |
| --border_type <0\|1> | 0 | 0 reflective walls, 1 periodic domain. |
| -t, --nthreads <int> | 1 | OpenMP thread count for CPU solver. |
| -d, --dev <id> | 0 | CUDA device index to target. |
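For readers unfamiliar with the Morton ordering enabled by -m, the sketch below shows the general technique: quantize each position to an integer cell, interleave the bits of the three cell coordinates, and sort by the resulting code so that spatially close particles end up close in memory. This is a generic illustration; the bit width, the quantization, and the function names are assumptions and do not describe ORCS internals.

```python
def morton3d(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three cell indices into one code."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def morton_order(points, lo, hi, bits=10):
    """Return indices of `points` sorted along the Morton (Z-order) curve.

    `lo` and `hi` bound the cubic domain; coordinates are clamped to it.
    """
    scale = (1 << bits) - 1

    def code(p):
        cell = [min(scale, max(0, int((c - lo) / (hi - lo) * scale))) for c in p]
        return morton3d(*cell, bits=bits)

    return sorted(range(len(points)), key=lambda i: code(points[i]))


if __name__ == "__main__":
    pts = [(0.9, 0.9, 0.9), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
    print(morton_order(pts, 0.0, 1.0))
```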

BVH rebuild schemes (-x): 0=FIXED, 1=BASIC, 2=TOTAL_AVG, 3=LAST_AVG, 4=DERIVATIVE (mapped to rtxneighbors::OptimizerType).
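The two --border_type settings listed among the options correspond to two common ways of handling a particle that leaves the domain. The one-dimensional sketch below illustrates the general idea per axis for a box spanning [0, box]. It is an assumption about typical behavior, not a description of how ORCS implements its walls, and the function names are hypothetical.

```python
def reflect(x, v, box):
    """Reflective wall: mirror the position and flip the velocity (one bounce)."""
    if x < 0.0:
        return -x, -v
    if x > box:
        return 2.0 * box - x, -v
    return x, v


def wrap_periodic(x, box):
    """Periodic domain: a particle leaving one side re-enters on the other."""
    return x % box


def periodic_delta(a, b, box):
    """Shortest signed separation from a to b under the minimum-image convention."""
    d = (b - a) % box
    return d - box if d > box / 2 else d


if __name__ == "__main__":
    print(reflect(10.5, 2.0, 10.0))
    print(wrap_periodic(10.5, 10.0))
    print(periodic_delta(9.5, 0.5, 10.0))
```

With periodic boundaries, neighbor distances should use something like `periodic_delta` on each axis, so that particles near opposite faces of the domain are still found within the search radius.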

Example

./ORCS 2 500000 128 -c 3.0 -p u --rmin 50 --rmax 120 -i 20 -m -x 2 --border_type 1 -d 0

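Runs the RTX solver (method 2) on 500,000 particles with max_neighbors set to 128, uniform random placement (-p u), a cutoff factor of 3.0, radii between 50 and 120, 20 iterations, Morton ordering (-m), the running-average BVH rebuild scheme (-x 2, TOTAL_AVG), and periodic boundaries, on CUDA device 0.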
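For scripted benchmarks, the numeric solver IDs and rebuild schemes can be given readable names. The sketch below is a hypothetical helper, not shipped with ORCS: the enum values mirror the tables above, while the Python member names and `orcs_command` are made up for illustration. Flags are passed through unchecked.

```python
from enum import IntEnum


class Method(IntEnum):
    CPU = 0
    GRID = 1
    RTX = 2
    RTX_PHYSICS = 3
    RTX_PAYLOAD = 4
    NAIVE_GPU = 5
    GRID_V2 = 6


class RebuildScheme(IntEnum):
    FIXED = 0
    BASIC = 1
    TOTAL_AVG = 2
    LAST_AVG = 3
    DERIVATIVE = 4


def orcs_command(method, n_particles, max_neighbors, *flags, binary="./ORCS"):
    """Assemble an argument list for the ORCS binary; enums become their IDs."""
    def fmt(arg):
        return str(int(arg)) if isinstance(arg, IntEnum) else str(arg)

    return [binary, fmt(method), str(n_particles), str(max_neighbors),
            *(fmt(f) for f in flags)]


if __name__ == "__main__":
    cmd = orcs_command(
        Method.RTX, 500000, 128,
        "-c", 3.0, "-p", "u", "--rmin", 50, "--rmax", 120,
        "-i", 20, "-m", "-x", RebuildScheme.TOTAL_AVG,
        "--border_type", 1, "-d", 0,
    )
    print(" ".join(cmd))  # same command line as the example above
```

The list form can be passed directly to `subprocess.run(cmd, cwd="build", check=True)`, which avoids shell quoting issues when sweeping over methods or particle counts.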

Repository Layout

  • src/: core application, solvers, CUDA/OptiX kernels, and benchmarking harness.
  • include/: third-party single-header dependencies (e.g., CLI11.hpp).
  • CMake/: custom find modules (FindOptiX7.cmake, OptiX IR utilities).
  • optixir/: generated OptiX IR/PTX blobs (populated at build time).
  • plots/, results/, scripts/: supporting analysis and automation.
  • build/: CMake build tree (ignored by version control; shown here for reference).

License

ORCS is distributed under the terms of the repository's LICENSE file.
