Disable LMOD cache in test step so we can use modules built in the build step #679

Merged
boegel merged 5 commits into EESSI:2023.06-software.eessi.io from casparvl:disabling_lmod_cache_in_test_step on Aug 20, 2024

@casparvl
Collaborator

No description provided.

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-aws

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-azure
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell resulted in:

    • no jobs were submitted
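
The short-to-expanded mapping reported above (repo: becomes repository:, arch: becomes architecture:) can be sketched as follows. This is only an illustration inferred from the two forms shown in the bot's message. The function name expand_bot_command is hypothetical, and this is not the bot's actual implementation.

```shell
# Hypothetical helper: expand the short filter keys seen in this thread.
# Only the two abbreviations that appear in the bot's message are handled.
expand_bot_command() {
    expanded=""
    for word in $1; do
        case "$word" in
            repo:*) word="repository:${word#repo:}" ;;
            arch:*) word="architecture:${word#arch:}" ;;
        esac
        # Join words with single spaces, without a leading space.
        expanded="${expanded:+$expanded }$word"
    done
    printf '%s\n' "$expanded"
}

expand_bot_command "build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell"
# prints: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
```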

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-haswell for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_679/16716

date                      job status   comment
Aug 20 11:37:30 UTC 2024  submitted    job id 16716 awaits release by job manager
Aug 20 11:38:04 UTC 2024  released     job awaits launch by Slurm scheduler
Aug 20 11:39:06 UTC 2024  running      job 16716 is running
Aug 20 11:41:08 UTC 2024  finished     😁 SUCCESS
Details
✅ job output file slurm-16716.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1724153983.tar.gz
size: 0 MiB (132577 bytes)
entries: 24
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
patchelf/0.17.2-GCCcore-12.2.0.lua
software under 2023.06/software/linux/x86_64/intel/haswell/software
patchelf/0.17.2-GCCcore-12.2.0
other under 2023.06/software/linux/x86_64/intel/haswell
no other files in tarball
Aug 20 11:41:08 UTC 2024  test result  😢 FAILURE
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-16716.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
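
The pass/fail lines above read like simple pattern checks on the job output file. A minimal sketch of such checks, assuming plain fixed-string matching with grep: the helper names check_absent and check_present are hypothetical, the temporary file only stands in for slurm-16716.out (whose contents are not shown in this thread), and the bot's real checks may differ.

```shell
# Hypothetical helpers: report whether fixed string $2 occurs in file $1.
check_absent() {
    if grep -q -F -- "$2" "$1"; then
        echo "❌ found message matching $2"
        return 1
    fi
    echo "✅ no message matching $2"
}

check_present() {
    if grep -q -F -- "$2" "$1"; then
        echo "✅ found message matching $2"
    else
        echo "❌ no message matching $2"
        return 1
    fi
}

# Stand-in for the real job output file; these two lines are made up for the demo.
job_out=$(mktemp)
printf '%s\n' 'No missing installations' 'example.tar.gz created!' > "$job_out"

check_absent "$job_out" 'ERROR:'
check_absent "$job_out" 'FAILED:'
check_absent "$job_out" 'required modules missing:'
check_present "$job_out" 'No missing installations'
check_present "$job_out" '.tar.gz created!'
rm -f "$job_out"
```

With checks of this kind, a single ERROR: line anywhere in the output is enough to flag the test step as failed, which matches the test result reported above.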

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-aws

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-azure
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell resulted in:

    • no jobs were submitted
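
A likely reading of "no jobs were submitted": eessi-bot-mc-azure reported only x86_64/amd/zen4 in its configuration, so a request for x86_64/intel/haswell matches nothing on that instance, while eessi-bot-mc-aws lists haswell and picks the job up. A minimal sketch of that kind of filter, assuming an exact match on the architecture string. The helper name instance_builds_arch is hypothetical, and the bot's real matching logic is not shown in this thread.

```shell
# Architectures each instance reported in its configuration comment.
AWS_ARCHS="x86_64/generic x86_64/intel/haswell x86_64/intel/skylake_avx512 x86_64/amd/zen2 x86_64/amd/zen3 aarch64/generic aarch64/neoverse_n1 aarch64/neoverse_v1"
AZURE_ARCHS="x86_64/amd/zen4"

# Hypothetical helper: succeed if architecture $1 is in the space-separated list $2.
instance_builds_arch() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

requested="x86_64/intel/haswell"
for instance in aws azure; do
    case "$instance" in
        aws) archs="$AWS_ARCHS" ;;
        azure) archs="$AZURE_ARCHS" ;;
    esac
    if instance_builds_arch "$requested" "$archs"; then
        echo "eessi-bot-mc-$instance: a job would be submitted"
    else
        echo "eessi-bot-mc-$instance: no jobs were submitted"
    fi
done
```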

@eessi-bot

eessi-bot Bot commented Aug 20, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-haswell for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_679/16717

date                      job status   comment
Aug 20 14:07:07 UTC 2024  submitted    job id 16717 awaits release by job manager
Aug 20 14:07:22 UTC 2024  released     job awaits launch by Slurm scheduler
Aug 20 14:14:25 UTC 2024  running      job 16717 is running
Aug 20 14:16:27 UTC 2024  finished     😁 SUCCESS
Details
✅ job output file slurm-16717.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1724163304.tar.gz
size: 0 MiB (131489 bytes)
entries: 24
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
patchelf/0.17.2-GCCcore-12.2.0.lua
software under 2023.06/software/linux/x86_64/intel/haswell/software
patchelf/0.17.2-GCCcore-12.2.0
other under 2023.06/software/linux/x86_64/intel/haswell
no other files in tarball
Aug 20 14:16:27 UTC 2024  test result  😢 FAILURE
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-16717.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

casparvl changed the title from "Show effect of disableing LMOD cache" to "Disable LMOD cache in test step so we can use newly built modules" Aug 20, 2024
casparvl changed the title from "Disable LMOD cache in test step so we can use newly built modules" to "Disable LMOD cache in test step so we can use modules built in the build step" Aug 20, 2024
@boegel
Contributor

boegel commented Aug 20, 2024

@casparvl Setting $LMOD_IGNORE_CACHE makes sense to me, since the Lmod cache is only updated on ingestion.
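
As a minimal sketch of what this amounts to in a job script: LMOD_IGNORE_CACHE is the Lmod environment variable named above, while the wrapper function run_without_lmod_cache and the choice to scope the setting to a subshell are illustrative assumptions, not the code changed in this PR.

```shell
# Hypothetical wrapper: run a command with the Lmod cache bypassed.
run_without_lmod_cache() {
    # Export inside a subshell so the setting does not leak into later steps.
    (
        export LMOD_IGNORE_CACHE=1
        "$@"
    )
}

# Example: show that the variable is visible to the command being run.
run_without_lmod_cache sh -c 'echo "LMOD_IGNORE_CACHE=${LMOD_IGNORE_CACHE}"'
# prints: LMOD_IGNORE_CACHE=1
```

Bypassing the cache makes Lmod walk the module tree directly, which is slower but picks up modules that were installed after the cache was last regenerated.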

@casparvl
Collaborator Author

I know, right? ;-) The only thing that surprises me is that we didn't do this sooner :D

@casparvl
Collaborator Author

Output is completely as expected:

  • when using LMOD cache, patchelf/0.17.2-GCCcore-12.2.0 that was built in the build step can be neither found by module av nor loaded by module load. The patchelf command available then is the one from the compat layer.
  • when not using LMOD cache, patchelf/0.17.2-GCCcore-12.2.0 that was built in the build step is both found by module av and loaded by module load. The patchelf command available then is the one from the module that was just built.

Trying to load modules while LMOD cache is used

---- /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/haswell/modules/all ----
   patchelf/0.18.0-GCCcore-12.3.0    patchelf/0.18.0-GCCcore-13.2.0 (D)

  Where:
   D:  Default Module

If the avail list is too long consider trying:

"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".


Lmod has detected the following error: The following module(s) are unknown:
"patchelf/0.17.2-GCCcore-12.2.0"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "patchelf/0.17.2-GCCcore-12.2.0"

Also make sure that all modulefiles written in TCL start with the string
#%Module



/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/patchelf

Trying to load modules while LMOD cache is not used

---- /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/haswell/modules/all ----
   patchelf/0.17.2-GCCcore-12.2.0    patchelf/0.18.0-GCCcore-13.2.0 (D)
   patchelf/0.18.0-GCCcore-12.3.0

  Where:
   D:  Default Module

If the avail list is too long consider trying:

"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".


/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/haswell/software/patchelf/0.17.2-GCCcore-12.2.0/bin/patchelf
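
The comparison above could be reproduced along these lines. This is a sketch, not the debugging code that produced the output: the exact commands used in the PR are not shown in this thread, show_patchelf is a hypothetical helper, and the script simply skips the module calls when no module command is defined in the current shell.

```shell
# Module name taken from the thread.
MODULE_NAME="patchelf/0.17.2-GCCcore-12.2.0"

# Hypothetical helper: print a label, then list, load and locate patchelf.
show_patchelf() {
    echo "$1"
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available, skipping"
        return 0
    fi
    # Keep going even if a step fails, so every result is shown.
    module avail patchelf 2>&1 || true
    module load "$MODULE_NAME" 2>&1 || true
    command -v patchelf || true
}

# Each run happens in its own subshell, so a loaded module or the
# cache setting from one run cannot affect the other.
( unset LMOD_IGNORE_CACHE; show_patchelf "Trying to load modules while LMOD cache is used" )
( export LMOD_IGNORE_CACHE=1; show_patchelf "Trying to load modules while LMOD cache is not used" )
```

For a one-off, the per-command form suggested by Lmod's own error message, module --ignore_cache load "patchelf/0.17.2-GCCcore-12.2.0", bypasses the cache for that single command.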

Now that I've demonstrated this, let me strip out the debugging output and only keep the environment variable that makes sure the LMOD cache isn't used :)

casparvl marked this pull request as ready for review August 20, 2024 18:06
boegel merged commit 728eb66 into EESSI:2023.06-software.eessi.io Aug 20, 2024
boegel added the 2023.06-software.eessi.io label (2023.06 version of software.eessi.io) Aug 20, 2024
casparvl deleted the disabling_lmod_cache_in_test_step branch August 21, 2024 08:59

Labels

2023.06-software.eessi.io 2023.06 version of software.eessi.io
