Main changes:
- new version 2022.11
- using trz42/gentoo-overlay until new sets have been added to EESSI/gentoo-overlay
- using a recent commit to gentoo/gentoo.git plus adding some comments about history of used commits
- setting a new version of the default gcc to 10.4.0 (from 9.4.0)
- masking sys-devel/gcc greater than or equal to 10.4.1 (avoids the issue that a newer package is installed in bootstrap stages 2/3 and the default gcc then cannot be set with gcc-config)
- adding clearance of the LD_LIBRARY_PATH variable to the singularity command (otherwise the bootstrap-prefix.sh script fails if LD_LIBRARY_PATH is set)
- change `become` to true for the task changing ownership of installed files
- using a recent bootstrap-prefix.sh script from gentoo/prefix.git with three changes from EESSI (setting of the shell in bootstrap_startscript, using https for setting DISTFILES_{G_O,PFX}, setting ans to no in line 2839 {question about stable packages keywords})
Changes by commit c43a5bf0d00d3cb2a0452c35b5c42bf4fc4cfc9f to gentoo/prefix/scripts/bootstrap-prefix.sh
You know you don't need to stick to that after installing, right? You can and should update to a newer version of GCC if you want. Only the bootstrap process is a bit finicky, but once you have bootstrapped, you are usually free from the constraints. You can also install multiple versions of GCC (e.g. GCC 12 for CPU, GCC 11 for CUDA), which you can use later as
Good to know. I think for EESSI we only need one version that is then used to build up toolchains (incl. compilers) in the software layer. In the past we ran into issues with too recent versions of GCC in that step. Hence, we'd rather restrict ourselves to the oldest version of GCC in the compat layer. This might change in the future when we move to more recent toolchains in the software layer.
I see. I suggest trying to use the toolchain from the compat layer to build the software layer, to avoid having to add ld wrappers, etc., since the toolchain is already configured to work well with the non-standard glibc and linker. It will also let you control binutils with
      - name: eessi
        source: git
    -   url: https://github.com/EESSI/gentoo-overlay.git
    +   url: https://github.com/trz42/gentoo-overlay.git
Just pointing this out, so we don't forget to revert this back to EESSI after EESSI/gentoo-overlay#84 is merged...
@trz42 How do you deal with the branch aspect? Or did you just merge the eessi-2022.11 branch you used for EESSI/gentoo-overlay#84 into your main branch?
To answer my own question: yes, you've just updated the main branch in your fork with EESSI/gentoo-overlay#84, OK.
boegel left a comment
It looks like the install_prefix.yml task will also need a change to fix ansible-lint check?
Jinja templates should only be at the end of 'name'
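To illustrate what that lint message is about, here is a minimal sketch. The task names, the `example_packages` variable, and the use of `ansible.builtin.debug` are hypothetical and not taken from `install_prefix.yml`; only the rule text is from the comment above.

```yaml
# Flagged by ansible-lint: the Jinja template sits in the middle of 'name'.
- name: "Show {{ item }} before installing"
  ansible.builtin.debug:
    msg: "{{ item }}"
  loop: "{{ example_packages }}"  # hypothetical variable

# Accepted: same task, with the template moved to the end of 'name'.
- name: "Show package before installing: {{ item }}"
  ansible.builtin.debug:
    msg: "{{ item }}"
  loop: "{{ example_packages }}"  # hypothetical variable
```

The behaviour of both tasks is identical; only the position of the template in the task name differs.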
      # stick to GCC 9.x; using a too recent compiler in the compat layer complicates stuff in the software layer,
      # see for example https://github.com/EESSI/software-layer/issues/151
    - >=sys-devel/gcc-10
    + >=sys-devel/gcc-10.4.1
Comment above should be updated accordingly?
But maybe keep the pointer to EESSI/software-layer#151 since that can help explain why we prefer sticking to an older GCC version.
@trz42 Did you try to build GCC/9.3.0 on top of this compat layer with GCC 10.x as system compiler?
Good catch about the comment. We should keep the pointer yes.
Yes, I've built GCC/9.3.0 and then also software on top of it. See trz42/software-layer#42
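One way the updated comment could read, as a sketch only. It assumes `prefix_mask_packages` is a multi-line string in `defaults/main.yml`; the comment wording is a suggestion, while the mask line and the issue link are the ones from the diff above.

```yaml
prefix_mask_packages: |
  # stick to GCC 10.4.0; the newer 10.4.1_p* ebuilds could not be set as default with gcc-config,
  # and a too recent compiler in the compat layer complicates stuff in the software layer,
  # see for example https://github.com/EESSI/software-layer/issues/151
  >=sys-devel/gcc-10.4.1
```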
      # trying to set 10.4.1_p20221006 fails
      # we mask sys-devel/gcc below to not install anything newer than 10.4.0
      # gentoo_git_commit: c2d8ce0e1b6206a225a9f2547bbc65c79218756c
      # 2022.11 (Nov 3 2022) second iteration made for PR
We can prune the comments. For me they add some context right now while we get going again. If we update more frequently, or even build compat layers with a bot, they might not provide much useful context, or there may be other ways to document the history.
GCC ebuilds with _p in the version are versions of GCC with extra patches by Gentoo, usually to solve bugs, so you can consider them similar to the other versions. For example sys-devel/gcc-13.0.0_pre20221030 was added to fix https://bugs.gentoo.org/879049. Also, the only reason that GCC 10.x is used for bootstrapping is because GCC 11 requires C++11 to build, and that itself requires a new enough compiler that may not be available on old systems. You don't have to stick to older versions of GCC in EESSI. You can just leave the masks free and choose GCC by adding a slotted version to your set, like sys-devel/gcc:11 if you want GCC 11.x but don't care which minor version. It's quite safe to keep it updated within the same major version (i.e. that won't break packages built on top of the compat layer).
      local: "{{ playbook_dir }}/../../bootstrap-prefix.sh"
      remote: /tmp/bootstrap-prefix.sh
    - prefix_singularity_command: "singularity exec -B {{ gentoo_prefix_path }}:{{ gentoo_prefix_path }}"
    + prefix_singularity_command: "singularity exec --env LD_LIBRARY_PATH= -B {{ gentoo_prefix_path }}:{{ gentoo_prefix_path }}"
@trz42 We should add a comment above this line to clarify why LD_LIBRARY_PATH is explicitly set to empty?
Because otherwise the bootstrap script simply stops running. It's explained in the PR.
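A possible form for the requested comment, as a sketch. The variable line is the one from this PR; the comment wording is only a suggestion.

```yaml
# Clear LD_LIBRARY_PATH inside the container: the updated bootstrap-prefix.sh
# stops running if this variable is set in the environment.
prefix_singularity_command: "singularity exec --env LD_LIBRARY_PATH= -B {{ gentoo_prefix_path }}:{{ gentoo_prefix_path }}"
```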
      path: "{{ gentoo_prefix_path }}"
      recurse: true
    - become: false
    + become: true
Also here we should clarify why true (or false) is needed; this become bit is a bit cryptic, I think...
If the ansible script is run as a normal user, it cannot change ownership of files it doesn't own. That's the purpose of the task ... to change ownership to the user running the task.
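A sketch of how the task could carry that explanation inline. Only `path`, `recurse` and `become` come from the diff above; the task name, the choice of `ansible.builtin.file`, the explicit `state`, and the `prefix_owner` variable are assumptions made for illustration and may not match the actual task.

```yaml
- name: Change ownership of installed files (hypothetical task name)
  ansible.builtin.file:
    path: "{{ gentoo_prefix_path }}"
    state: directory
    owner: "{{ prefix_owner }}"  # hypothetical variable: the user who should own the files
    recurse: true
  # A normal user cannot change ownership of files owned by someone else,
  # so this task needs privilege escalation.
  become: true
```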
    - - include_tasks: install_prefix.yml
    + - name: Include task install_prefix.yml
    +   ansible.builtin.include_tasks: install_prefix.yml
I guess this was changed for a similar reason as community.general.portage above?
That was changed to make ansible-lint happy.
I'd rather reconfigure the ansible-lint check to ignore the Jinja template issue. The script has run fine a dozen times. The "ERROR" only states that templates should be at the end, so it doesn't matter. I'm also curious why we need to fix this here. Maybe it is better to do a code quality hackathon than to impose unrelated code improvements on PRs.
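Such a reconfiguration might look like the sketch below. It assumes the check reads a `.ansible-lint` file in the repository root and that the installed ansible-lint version reports this message under the rule id `name[template]`; both should be verified against the version used in CI.

```yaml
# .ansible-lint
skip_list:
  - name[template]  # "Jinja templates should only be at the end of 'name'"
```

Using `warn_list` instead of `skip_list` would keep the message visible without failing the check.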
Probably interesting finds on building the compat layer:
This PR has become obsolete with new developments in early 2023.
This includes a number of updates to the current version (2021.12):

File `bootstrap-prefix.sh`:
- `bootstrap-prefix.sh` has been sync'ed with updates to the upstream version (https://github.com/gentoo/prefix.git) until a recent update on Nov 2 2022. The script here still includes a few changes we may want to revisit, i.e. whether they are necessary or whether they can be removed to allow us to just use the upstream script.
- The remaining differences to the upstream script can be listed with `diff gentoo-prefix/scripts/bootstrap-prefix.sh EESSI-compatibility-layer/bootstrap-prefix.sh`.

File `ansible/playbooks/roles/compatibility_layer/defaults/main.yml`:
- The version is set to `2022.11`. (`eessi_version: "2022.11"`)
- A fork of the overlay is used for now (`custom_overlays.url: https://github.com/trz42/gentoo-overlay.git`)
- The commit of gentoo/gentoo.git (`gentoo_git_commit: cec3214ef5d5661e28c9d2c5b5750b27c27c5435`) being used is updated to a more recent version (from Nov 3 2022). A bit of information about the history of used commits was added. It can be removed if deemed unnecessary.
- The default `gcc` version has been increased to `10.4.0` (see `prefix_default_gcc: 10.4.0`). (Requirement in the upstream `bootstrap-prefix.sh` script.)
- The version of `gcc` being used had to be restricted to anything before `10.4.1` because some `10.4.1_p*` ebuilds were added to https://github.com/gentoo/gentoo/sys-devel/gcc recently. While those installed fine during the bootstrap stages 2/3, it was not immediately clear how to set them as default with `gcc-config`. Simply setting them via `prefix_default_gcc` (see item on default gcc) didn't work. (See line `>=sys-devel/gcc-10.4.1` for `prefix_mask_packages:`)
- The arguments in `prefix_singularity_command` were changed to unset `LD_LIBRARY_PATH` (the `bootstrap-prefix.sh` in `2021.12` had a typo in the check whether `LD_LIBRARY_PATH` is set, hence it went on even if it was set). The updated `bootstrap-prefix.sh` checks for the correct variable.

File `ansible/playbooks/roles/compatibility_layer/tasks/install_packages.yml`:
- The task changing ownership of installed files now runs with `root` permissions. (Could be an issue with the testing setup. Might be good to verify if it is needed in other environments.)