Closed
Labels: bug (Something isn't working)
Description
Describe the issue:
I am running a slightly modified version of the VSI test (3D, different resolution, lower scale height, dumps and VTK outputs every 10 orbits). According to the log file, the outputs are written without errors:
Vtk: Write file data.0002.vtk...done in 3.206488e+00 s.
Dump: Write file n 2...done in 1.527347e+00 s.
However, the output directory listing shows that every vtk and dmp file after the first output is very small:
6,6G 8. Sep 13:57 data.0000.vtk
8,0K 8. Sep 18:56 data.0001.vtk
8,0K 9. Sep 00:06 data.0002.vtk
7,6G 8. Sep 13:57 dump.0000.dmp
56K 8. Sep 18:56 dump.0001.dmp
56K 9. Sep 00:06 dump.0002.dmp
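For longer runs, the same check can be automated instead of reading the listing by eye. The following is a minimal sketch, not part of Idefix: the helper name `find_truncated` and the 1% size threshold are assumptions chosen for illustration. It flags every file that is far smaller than the largest file matching the same pattern.

```python
from pathlib import Path


def find_truncated(directory, pattern, ratio=0.01):
    """Return (name, size) pairs for files much smaller than the largest match.

    `ratio` is a hypothetical threshold: a file is flagged when its size is
    below `ratio` times the size of the largest file matching `pattern`.
    """
    files = sorted(Path(directory).glob(pattern))
    sizes = {f.name: f.stat().st_size for f in files}
    if not sizes:
        return []
    reference = max(sizes.values())
    return [(name, size) for name, size in sizes.items()
            if size < ratio * reference]
```

Calling `find_truncated(out_dir, "data.*.vtk")` and `find_truncated(out_dir, "dump.*.dmp")` on the output directory should, for the listing above, report outputs 0001 and 0002 of each type.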
I inspected the dump file: the header and coordinates appear to have been written correctly, but reading fails at the data of the first field, Vc-RHO.
Is this something you have seen before, or could it be an issue with the file system?
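As a rough sanity check on what a complete output should weigh, the field payload can be estimated from the grid reported in the log (1280 x 384 x 512 cells). This is only a sketch under stated assumptions: four cell-centered fields (density and three velocity components, as expected for an isothermal HD run), 8 bytes per value in the dump (the build is double precision), and 4 bytes per value in the VTK file (assumed single precision). Headers and grid geometry are ignored.

```python
def expected_payload_bytes(nx, ny, nz, nfields, bytes_per_value):
    """Bytes needed to store nfields scalar fields on an nx*ny*nz grid."""
    return nx * ny * nz * nfields * bytes_per_value


# Grid size taken from the log; the field count is an assumption
# (RHO, VX1, VX2, VX3 for isothermal hydrodynamics).
NX, NY, NZ = 1280, 384, 512
NFIELDS = 4

dump_bytes = expected_payload_bytes(NX, NY, NZ, NFIELDS, 8)  # double precision
vtk_bytes = expected_payload_bytes(NX, NY, NZ, NFIELDS, 4)   # assumed single precision

print(f"dump field data: {dump_bytes / 2**30:.2f} GiB")  # 7.50 GiB
print(f"vtk field data:  {vtk_bytes / 2**30:.2f} GiB")   # 3.75 GiB
```

The dump estimate of 7.5 GiB is close to the 7,6G listed for dump.0000.dmp, which suggests the first dump is complete and that the 56K files hold little more than the header and coordinates. The VTK file also stores the grid geometry, so it is expected to be larger than this field-only estimate.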
Error message:
No response
runtime information:
This is how the code is built in the Slurm script:
module load spack/2024.04
module load cmake/3.20.2-gcc-11.4.1 cuda/11.8.0
module load openmpi/5.0.0-gcc-11.4.1-cuda11.8
# this is for an H100
cmake $IDEFIX_DIR -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_HOPPER90=ON
make -j 2
Dump file header is Idefix 2.1.01-2f15373c Dump Data little endian.
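The header string of a dump can be inspected without any Idefix tooling, which helps to confirm that a small file at least starts correctly. The sketch below assumes the file begins with an ASCII string that is terminated or padded with NUL bytes; the function name `read_leading_text` and the 128-byte read length are hypothetical choices, not values taken from the Idefix dump format.

```python
def read_leading_text(path, nbytes=128):
    """Return the ASCII text at the start of a binary file.

    Reads at most `nbytes` bytes (an arbitrary, assumed limit) and keeps
    everything before the first NUL byte.
    """
    with open(path, "rb") as fh:
        raw = fh.read(nbytes)
    text = raw.split(b"\0", 1)[0]
    return text.decode("ascii", errors="replace").strip()
```

For example, `read_leading_text("dump.0001.dmp")` could be compared against the header string of `dump.0000.dmp`.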
Below I add the beginning of the log file:
-- Setting default Kokkos CXX standard to 17
-- Kokkos version: 4.3.1
-- The project name is: Kokkos
-- Using internal gtest for testing
-- Compiler Version: 11.8.89
-- kokkos_launch_compiler (/idefix/src/kokkos/bin/kokkos_launch_compiler) is enabled...
-- Using -std=c++17 for C++17 standard as feature
-- Built-in Execution Spaces:
-- Device Parallel: Kokkos::Cuda
-- Host Parallel: NoTypeDefined
-- Host Serial: SERIAL
--
-- Architectures:
-- HOPPER90
-- Using internal desul_atomics copy
-- Kokkos Backends: SERIAL;CUDA
-- Idefix final configuration
-- MHD: OFF
-- MPI: OFF
-- HDF5: OFF
-- Reconstruction: Linear
-- Precision: Double
-- Version: 2.1.01-2f15373c
-- Problem definitions: 'definitions.hpp'
-- Configuring done (1.0s)
-- Generating done (0.4s)
-- Build files have been written to: /idefix-setups/idefix_VSI
[ 3%] Built target kokkossimd
[ 3%] Built target AlwaysCheckGit
[ 6%] Built target impl_git_version
[ 40%] Built target kokkoscore
[ 43%] Built target kokkoscontainers
[ 46%] Building CXX object CMakeFiles/idefix.dir/src/output/dump.cpp.o
[ 46%] Building CXX object CMakeFiles/idefix.dir/src/dataBlock/dumpToFile.cpp.o
[ 47%] Building CXX object CMakeFiles/idefix.dir/src/output/vtk.cpp.o
[ 49%] Building CXX object CMakeFiles/idefix.dir/src/input.cpp.o
[ 50%] Linking CXX executable idefix
[100%] Built target idefix
Starting job
I'm on Host [...].physik.uni-muenchen.de
It's now So 8. Sep 13:56:51 CEST 2024
.:HMMMMHn:. ..:n..
.H*'`` `'%HM'''''!x.
:x x*` .(MH: `#h.
x.`M M> :nMMMMMMMh. `n.
*kXk.. XL nnx:.XMMMMMMMMMMML .. 4X.
)MMMMMx 'M `^?M*MMMMMMMMMMMM:HMMMHHMM.
MMMMMMMX ?k 'X ..'*MMMMMMM.#MMMMMMMMMx
XMMMMMMMX 4: M:MhHxxHHHx`MMx`MMMMMMMMM>
XM!` ?M `x 4MM'`''``HHhMMX 'MMMMMMMM
4M M `: *> `` .('MX '*MMMM'
MX `X.nnx.. ..XMx` 'M*X
?h. ''```^'*!Hx. :Mf xHMh M**MMM 4L`
`*Mx `'*n.x. 4M> :M` `` 'M ` %
'% ``*MHMX X> !
:! `#MM> X> ` :x
:M ?M `X . ..'M
XX .!*X `x XM( MMx`h
'M>:: `M: `+ MMX XMM `:
'M> M 'X 'MMX ?MMk.Xx..
'M> ?L ...:! MMX.H**'MMMM*h
M> #L :!'`MM. . X*`.xHMMMMMnMk.
`! #h. :L XM'*hxHMM*MhHMMMMMMMMMM'#h
+ XMh: 4! x :f MM' `*MMMMMMMMMM% `X
M Mf``tHhxHM M> 4k xxX' `#MMMMMMMf `M .>
:f M `MMMMM: M> M!MMM: '*MMf' 'MH*
! Xf 'MMMMMX `X X>'h.` :P*Mx. .d*~..
:M X 4MMMMM> ! X~ `Mh. .nHL..M#'%nnMhH!'`
XM d> 'X`'**h 'h M ^'MMHH+*'` '''' `'**'
%nxM> *x+x.:. XL.. `k `::X
:nMMHMMM:. X> Mn`*MMMMMHM: `: ?MMn.
`'**MML M> 'MMhMMMMMMMM # `M:^*x
^*MMttnnMMMMMMMMMMMH>. M:.4X
`MMMM>X ( .MMM:MM! .
`'''4x.dX +^ `''MMMMHM?L..
``' `'`'`'`
Idefix version 2.1.01-2f15373c
Built against Kokkos 40301
Compiled on Sep 8 2024 at 13:56:42
Main: initialization stage.
Main: initialisation finished.
Main: running on [...].physik.uni-muenchen.de
-----------------------------------------------------------------------------
Input Parameters using input file idefix.ini:
-----------------------------------------------------------------------------
[Boundary]
X1-beg userdef
X1-end outflow
X2-beg outflow
X2-end outflow
X3-beg periodic
X3-end periodic
[Gravity]
Mcentral 1.0
gravCst 1
potential central
skip 1
[Grid]
X1-grid 1 1.0 1280 l 3.0
X2-grid 1 1.2707963267948965 384 u 1.8707963267948966
X3-grid 1 0.0 512 u 1.5707963267948966
[Hydro]
csiso userdef
solver hllc
[Output]
dmp 62.831853071795865
dmp_dir /idefix_VSI3D
log 100
vtk 62.831853071795865
vtk_dir /idefix_VSI3D
[Setup]
epsilon 0.05
[TimeIntegrator]
CFL 0.8
CFL_max_var 1.1
check_nan 100
first_dt 1.e-3
max_runtime -1
maxdivB 1e-06
nstages 2
tstop 1256.6370614359
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
Input: Kokkos configuration
Device Execution Space:
KOKKOS_ENABLE_CUDA: yes
Cuda Options:
KOKKOS_ENABLE_CUDA_LAMBDA: yes
KOKKOS_ENABLE_CUDA_LDG_INTRINSIC: yes
KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE: no
KOKKOS_ENABLE_CUDA_UVM: no
KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA: yes
KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC: yes
Cuda Runtime Configuration:
macro KOKKOS_ENABLE_CUDA : defined
macro CUDA_VERSION = 11080 = version 11.8
Kokkos::Cuda[ 0 ] NVIDIA H100 NVL capability 9.0, Total Global Memory: 93.12 G, Shared Memory per Block: 48 K : Selected
-----------------------------------------------------------------------------
Input: Compiled with DOUBLE PRECISION arithmetic.
Input: DIMENSIONS=3.
Input: COMPONENTS=3.
Grid: full grid size is
Direction X1: userdef 1....1280....3 outflow
Direction X2: outflow 1.2708....384....1.8708 outflow
Direction X3: periodic 0....512....1.5708 periodic
Hydro: solving HD equations.
Hydro: Reconstruction: 2nd order (PLM Van Leer)
EquationOfState: isothermal with user-defined cs function.
RiemannSolver: hllc (HD).
Gravity: ENABLED.
Gravity: G=1.
Gravity: central mass gravitational potential ENABLED with M=1
TimeIntegrator: using 2nd Order (RK2) integrator.
TimeIntegrator: Using adaptive dt with CFL=0.8 .
Main: Creating initial conditions.
Vtk: Write file data.0000.vtk...done in 9.42678 s.
Dump: Write file n 0...done in 8.82041 s.
Main: Cycling Time Integrator...
TimeIntegrator: time | cycle | time step | cell (updates/s)
TimeIntegrator: 0.000000e+00 | 0 | 1.000000e-03 | N/A
TimeIntegrator: 1.686485e-01 | 100 | 1.716926e-03 | 7.347258e+08