feat: Add Apptainer container definition for reproducible AnnotationGx environment#82
sisiranair wants to merge 1 commit into main
📝 Walkthrough
This PR introduces Apptainer container infrastructure for AnnotationGx, consisting of a definition file specifying the container image (based on rocker/r-ver:4.3.2 with AnnotationGx dependencies), a build script, and documentation explaining the container's purpose and usage.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@containers/apptainer/annotationgx.def`:
- Line 2: The base image pinned as "rocker/r-ver:4.3.2" does not satisfy the
package DESCRIPTION which requires R (>= 4.5.0); update the FROM image tag to a
rocker image with R 4.5.0 or newer (e.g., change "rocker/r-ver:4.3.2" to
"rocker/r-ver:4.5.0" or a later patch release), rebuild the container, and
verify R version (R --version) and package installation succeed; ensure any
image-specific tweaks in the container remain compatible after the upgrade.
- Around line 11-12: The definition file currently calls
BiocManager::install('AnnotationGx') without pinning a Bioconductor release (and
the comment references a non-standard Sys.getenv approach), which makes builds
non-reproducible; update the R install step that invokes BiocManager::install
(the R -e lines calling BiocManager::install(...)) to either: a) hardcode the
Bioconductor release compatible with the base image's R version (4.3.2 as
currently pinned) by calling BiocManager::install(version = "3.XX") and then
install AnnotationGx with ask = FALSE and update = FALSE, or b) switch the
container base to a pinned bioconductor/bioconductor_docker:RELEASE_3_XX image
so you can keep the BiocManager::install('AnnotationGx') call but rely on a
release-matched base; ensure the chosen approach replaces the unpinned
BiocManager::install call and includes ask = FALSE, update = FALSE to avoid
interactive prompts.
In `@containers/apptainer/build.sh`:
- Around line 4-7: The build script hardcodes sudo, uses relative paths, and
leaves variables unquoted; update it to enable strict mode (set -euo pipefail),
resolve DEF_FILE and IMAGE_NAME to absolute paths using the script directory
(e.g., derive DIR from "${BASH_SOURCE[0]}"), replace the literal sudo apptainer
call with a conditional that uses sudo only if needed/available (e.g., check if
running as root or if sudo exists and is permitted), and quote variable
expansions ("${IMAGE_NAME}" "${DEF_FILE}") to prevent word-splitting; make these
changes around the IMAGE_NAME, DEF_FILE, and the apptainer invocation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: be38fe55-556f-4e23-8b10-3e6f4adfa3df
📒 Files selected for processing (3)
containers/apptainer/README.md
containers/apptainer/annotationgx.def
containers/apptainer/build.sh
@@ -0,0 +1,19 @@
Bootstrap: docker
From: rocker/r-ver:4.3.2
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== DESCRIPTION requirement ==="
if [ -f DESCRIPTION ]; then
rg -n "R\s*\(>=" DESCRIPTION || echo "Pattern not found in DESCRIPTION"
else
echo "DESCRIPTION file not found"
fi
echo ""
echo "=== Apptainer base image ==="
if [ -f containers/apptainer/annotationgx.def ]; then
rg -n "^From:" containers/apptainer/annotationgx.def || echo "From not found"
else
echo "annotationgx.def not found"
fi
Repository: bhklab/AnnotationGx
Length of output: 171
Base image R version does not meet package requirements.
Line 2 pins rocker/r-ver:4.3.2, but DESCRIPTION requires R (>= 4.5.0). With this mismatch, R 4.3.2 will not install the package, so the install step inside the container fails and AnnotationGx is unavailable at runtime.
🔧 Proposed fix
-From: rocker/r-ver:4.3.2
+From: rocker/r-ver:4.5.0
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-From: rocker/r-ver:4.3.2
+From: rocker/r-ver:4.5.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@containers/apptainer/annotationgx.def` at line 2, The base image pinned as
"rocker/r-ver:4.3.2" does not satisfy the package DESCRIPTION which requires R
(>= 4.5.0); update the FROM image tag to a rocker image with R 4.5.0 or newer
(e.g., change "rocker/r-ver:4.3.2" to "rocker/r-ver:4.5.0" or a later patch
release), rebuild the container, and verify R version (R --version) and package
installation succeed; ensure any image-specific tweaks in the container remain
compatible after the upgrade.
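The mismatch described above can be caught before a build with a small pre-flight check. The following is a minimal sketch, not part of the PR: the function names are hypothetical, it assumes the base image tag is a plain R version (as with rocker/r-ver), and it relies on `sort -V` being available.

```shell
# Extract the minimum R version from a DESCRIPTION file,
# e.g. "Depends: R (>= 4.5.0)" -> 4.5.0
required_r_version() {
  grep -oE '(^|[[:space:],])R[[:space:]]*\(>=[[:space:]]*[0-9]+(\.[0-9]+)*' "$1" \
    | head -n 1 \
    | grep -oE '[0-9]+(\.[0-9]+)*$'
}

# Extract the version tag from the "From:" line of an Apptainer definition file,
# e.g. "From: rocker/r-ver:4.3.2" -> 4.3.2
base_image_r_version() {
  sed -nE 's/^From:[[:space:]]*[^:]+:([0-9]+(\.[0-9]+)*)[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Succeed when version $1 >= version $2 (version-aware ordering via sort -V)
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare DESCRIPTION ($1) against the definition file ($2).
# Returns 0 on match, 1 on mismatch, 2 when a version could not be parsed.
check_r_version() {
  local required have
  required="$(required_r_version "$1")"
  have="$(base_image_r_version "$2")"
  if [ -z "$required" ] || [ -z "$have" ]; then
    echo "could not determine R versions" >&2
    return 2
  fi
  if version_ge "$have" "$required"; then
    echo "OK: base image R $have satisfies R (>= $required)"
  else
    echo "MISMATCH: base image R $have is older than required R $required"
    return 1
  fi
}
```

From the repository root, this could be invoked as `check_r_version DESCRIPTION containers/apptainer/annotationgx.def`, for example as the first step of the build script.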
R -e "install.packages('BiocManager', repos='https://cloud.r-project.org')"
R -e "BiocManager::install('AnnotationGx')"
🧩 Analysis chain
🏁 Script executed:
find . -name "annotationgx.def" -type f
Repository: bhklab/AnnotationGx
Length of output: 102
🏁 Script executed:
cat -n containers/apptainer/annotationgx.def
Repository: bhklab/AnnotationGx
Length of output: 634
🌐 Web query:
Bioconductor R 4.5 compatible release version BiocManager install reproducible builds
💡 Result:
I tried to look this up on the web, but web.run returned no search results for multiple Bioconductor/BiocManager queries, so I can’t provide verified citations right now. Below is the best-effort guidance based on how Bioconductor releases map to R versions (please treat the exact Bioconductor release number as to-be-verified on the Bioconductor “Releases” page when you can access it).
Bioconductor release compatible with R 4.5
Bioconductor ties each release to a specific R major.minor. For R 4.5.x, the corresponding Bioconductor release is very likely Bioconductor 3.22 (Bioconductor releases every ~6 months and advances with the next R x.y).
Verify by checking the Bioconductor “Releases” table for the exact mapping.
Install a specific Bioconductor release with BiocManager (recommended for reproducibility)
install.packages("BiocManager")
## lock to the Bioconductor release that matches your R (or force a specific one)
BiocManager::install(version = "3.22") # verify the exact version for R 4.5
## then install packages
BiocManager::install(c("GenomicRanges", "DESeq2"))
To keep an install more repeatable within a project:
BiocManager::install(ask = FALSE, update = FALSE)
BiocManager::valid() # checks consistency of installed Bioconductor packages
Reproducible builds: what actually works well in practice
- Freeze package versions with renv (project-level lockfile): install.packages("renv"); renv::init(); renv::snapshot()
- Pin the Bioconductor release via BiocManager::install(version=...) (above).
- For strongest reproducibility across machines/CI, use a container:
  - Use a Bioconductor-provided Docker image for the matching Bioconductor release / R version, then install only what you need inside that image (and keep the Docker tag fixed).
If you paste the URL (or contents) of the Bioconductor “Releases” page/table you’re using, I can confirm the exact Bioconductor release number for R 4.5 and give you a fully pinned, copy-pasteable setup (BiocManager + renv + Docker tag).
🌐 Web query:
BiocManager install best practices reproducible environments Docker Apptainer containers
💡 Result:
Best practices for reproducible BiocManager installs in Docker / Apptainer
1) Pin the Bioconductor cohort (and therefore the compatible R major/minor)
- In scripts and Dockerfiles, set the Bioconductor version explicitly:
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager")
BiocManager::install(version = "3.22")
This ensures you’re installing packages from a consistent Bioconductor release cohort. [1]
- Record what you actually used:
BiocManager::version()
sessionInfo()
(Useful for debugging and later rebuilds.) [2]
2) Use the official Bioconductor Docker images and pin the image tag
- Prefer bioconductor/bioconductor_docker:RELEASE_X_Y (not latest) so the base environment is fixed. Bioconductor explicitly documents the RELEASE_X_Y tags and notes that omitting a tag will pull latest (which moves over time). [3]
- Extend with your own Dockerfile (recommended over “modify running container + commit”). [3]
Minimal pattern:
FROM bioconductor/bioconductor_docker:RELEASE_3_22
RUN R -e 'BiocManager::install(c("SingleCellExperiment","DESeq2"), ask=FALSE)'
(Using ask=FALSE avoids interactive prompts during image builds.) [1]
3) Pin CRAN to a snapshot if you need strict reproducibility
Bioconductor versions are cohort-based, but CRAN is not. If you need “same CRAN as of date X”, set CRAN to a dated snapshot (example shown using Posit Package Manager snapshots): [2]
options(repos = c(CRAN = "https://packagemanager.rstudio.com/all/<snapshot_id_or_date>"))
BiocManager::install(version="3.22")
4) Validate that packages match the Bioconductor release
After installs (especially if mixing sources), check consistency:
BiocManager::valid()
This flags packages that are “too new/old” for the active Bioconductor version. [1]
5) Prefer binaries inside Bioconductor containers when available
Bioconductor notes that (for newer RELEASE_3_14+ container lines) packages can be installed as binaries via BiocManager::install() in the container, greatly speeding builds and reducing compilation variability. [3]
Apptainer (Singularity) specifics (HPC-friendly)
6) Build an Apptainer image from the pinned Docker tag
Use a fixed Docker tag (e.g. RELEASE_3_22) as your source to keep the base identical to Docker. [3]
Typical pattern:
apptainer build bioc_3_22.sif docker://bioconductor/bioconductor_docker:RELEASE_3_22
7) Make your container compatible with Apptainer’s execution model
Apptainer recommends Docker/OCI images that: (a) don’t rely on a specific USER instruction (it’s ignored), and (b) don’t install software under /root (often inaccessible when running as a normal user). [4]
That lines up well with Bioconductor’s guidance to persist/install into standard library locations (or mounted library paths) rather than user-home-only paths. [3][4]
Sources
[1] Bioconductor install page (BiocManager install(version=...), ask=FALSE, valid())
[2] OSCA Bioconductor book section on Bioconductor versioning + CRAN snapshot example
[3] Bioconductor “Docker for Bioconductor” documentation (RELEASE tags, avoid latest, extending via Dockerfile, binaries)
[4] Apptainer docs: best practices for Docker/OCI compatibility (USER, /root install locations)
Pin Bioconductor release version for reproducibility.
The unpinned BiocManager::install('AnnotationGx') will pull the latest compatible release, making builds irreproducible across time. The proposed approach using Sys.getenv('BIOC_VERSION') is non-standard; Bioconductor best practice is to hardcode the release version directly or use a pinned Bioconductor Docker base image.
For R 4.3.2, determine the compatible Bioconductor release (e.g., 3.17 or 3.18), then either:
- Hardcode the version:
BiocManager::install(version = "3.XX") # for R 4.3.2
BiocManager::install('AnnotationGx', ask = FALSE, update = FALSE)
- Or switch the base image to a pinned Bioconductor release tag:
From: bioconductor/bioconductor_docker:RELEASE_3_XX
(This approach is cleaner and ensures all dependencies are pre-validated for that release cohort.)
If the base image is moved to R 4.5.x to satisfy DESCRIPTION, pick the Bioconductor release that matches that R version instead.
Include ask = FALSE, update = FALSE flags to prevent interactive prompts in non-interactive builds.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@containers/apptainer/annotationgx.def` around lines 11 - 12, The definition
file currently calls BiocManager::install('AnnotationGx') without pinning a
Bioconductor release (and the comment references a non-standard Sys.getenv
approach), which makes builds non-reproducible; update the R install step that
invokes BiocManager::install (the R -e lines calling BiocManager::install(...))
to either: a) hardcode the Bioconductor release compatible with the base image's
R version (4.3.2 as currently pinned) by calling
BiocManager::install(version = "3.XX") and then install AnnotationGx with
ask = FALSE and update = FALSE, or b) switch the container base to a pinned
bioconductor/bioconductor_docker:RELEASE_3_XX image so you can keep the
BiocManager::install('AnnotationGx') call but rely on a release-matched base;
ensure the chosen approach replaces the unpinned BiocManager::install call and
includes ask = FALSE, update = FALSE to avoid interactive prompts.
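Option (a) can be sketched as a small generator that prints a definition file with both the base image and the Bioconductor release pinned. This is an illustration under assumptions, not the PR's actual file: `render_def` is a hypothetical helper, the real annotationgx.def contains additional lines that are not reproduced here, and the final `library(AnnotationGx)` check is an added suggestion so that a failed install surfaces as an R error.

```shell
# Print an Apptainer definition with a pinned base image and Bioconductor release.
# $1: Docker image reference for the base (e.g. rocker/r-ver:4.5.0)
# $2: Bioconductor release; it must match the R version shipped in the base image
render_def() {
  local base_image="$1" bioc_version="$2"
  cat <<EOF
Bootstrap: docker
From: ${base_image}

%post
    R -e "install.packages('BiocManager', repos='https://cloud.r-project.org')"
    # Pin the Bioconductor release before installing any packages
    R -e "BiocManager::install(version = '${bioc_version}', ask = FALSE, update = FALSE)"
    R -e "BiocManager::install('AnnotationGx', ask = FALSE, update = FALSE)"
    # Loading the package errors out if the install did not succeed
    R -e "library(AnnotationGx)"
EOF
}
```

A call such as `render_def rocker/r-ver:4.5.0 3.21 > annotationgx.def` would write the file; the release number 3.21 is given only as an example for R 4.5 and should be confirmed against the Bioconductor release listing before use.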
IMAGE_NAME="annotationgx.sif"
DEF_FILE="annotationgx.def"

sudo apptainer build ${IMAGE_NAME} ${DEF_FILE}
(No newline at end of file)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, check if the file exists and show its content
if [ -f "containers/apptainer/build.sh" ]; then
echo "=== File content with line numbers ==="
cat -n "containers/apptainer/build.sh"
echo ""
echo "=== Pattern search results ==="
rg -n "sudo apptainer build|DEF_FILE=|IMAGE_NAME=" "containers/apptainer/build.sh" || echo "No matches found"
echo ""
echo "=== Check README ==="
if [ -f "containers/apptainer/README.md" ]; then
rg -n "sudo apptainer build" "containers/apptainer/README.md" || echo "No matches in README"
else
echo "README.md not found at containers/apptainer/"
fi
else
echo "File not found at containers/apptainer/build.sh"
# Try to find it
echo "Searching for build.sh files:"
fd "build.sh" containers/
fi
Repository: bhklab/AnnotationGx
Length of output: 502
Build script fails in CI/HPC environments due to hardcoded sudo and relative paths.
Line 7 hardcodes sudo, which is unavailable or restricted in many CI/CD and HPC systems. Lines 4–5 use relative paths without resolving the script's directory, causing failures when invoked from other working directories. Additionally, variables are unquoted, introducing word-splitting risks.
Replace hardcoded sudo with a conditional check, use absolute paths based on script location, add proper quoting, and upgrade to set -euo pipefail for safer execution:
Suggested fix
#!/bin/bash
-set -e
+set -euo pipefail
-IMAGE_NAME="annotationgx.sif"
-DEF_FILE="annotationgx.def"
+SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
+IMAGE_NAME="${SCRIPT_DIR}/annotationgx.sif"
+DEF_FILE="${SCRIPT_DIR}/annotationgx.def"
-sudo apptainer build ${IMAGE_NAME} ${DEF_FILE}
+if command -v sudo >/dev/null 2>&1; then
+ sudo apptainer build "${IMAGE_NAME}" "${DEF_FILE}"
+else
+ apptainer build "${IMAGE_NAME}" "${DEF_FILE}"
+fi
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-IMAGE_NAME="annotationgx.sif"
-DEF_FILE="annotationgx.def"
-sudo apptainer build ${IMAGE_NAME} ${DEF_FILE}
+#!/bin/bash
+set -euo pipefail
+SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
+IMAGE_NAME="${SCRIPT_DIR}/annotationgx.sif"
+DEF_FILE="${SCRIPT_DIR}/annotationgx.def"
+if command -v sudo >/dev/null 2>&1; then
+  sudo apptainer build "${IMAGE_NAME}" "${DEF_FILE}"
+else
+  apptainer build "${IMAGE_NAME}" "${DEF_FILE}"
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@containers/apptainer/build.sh` around lines 4 - 7, The build script hardcodes
sudo, uses relative paths, and leaves variables unquoted; update it to enable
strict mode (set -euo pipefail), resolve DEF_FILE and IMAGE_NAME to absolute
paths using the script directory (e.g., derive DIR from "${BASH_SOURCE[0]}"),
replace the literal sudo apptainer call with a conditional that uses sudo only
if needed/available (e.g., check if running as root or if sudo exists and is
permitted), and quote variable expansions ("${IMAGE_NAME}" "${DEF_FILE}") to
prevent word-splitting; make these changes around the IMAGE_NAME, DEF_FILE, and
the apptainer invocation.
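The suggested fix above uses sudo whenever the binary exists, which is slightly broader than the "only if needed/available" wording of the prompt. A possible refinement is sketched below; `privilege_prefix` is a hypothetical helper, and the policy it encodes (no prefix when already root, sudo only when it works without a password prompt, otherwise an unprivileged build) is an assumption rather than something the PR specifies.

```shell
# Decide which prefix, if any, to put in front of "apptainer build".
# $1: numeric user id to evaluate (defaults to the current user)
# Prints "sudo" when a non-interactive sudo is usable, otherwise prints nothing.
privilege_prefix() {
  local uid="${1:-$(id -u)}"
  if [ "$uid" -eq 0 ]; then
    return 0   # already root: no prefix needed
  fi
  if command -v sudo >/dev/null 2>&1 && sudo -n true >/dev/null 2>&1; then
    echo "sudo"   # sudo exists and does not require a password prompt
  fi
  # Otherwise print nothing and fall back to an unprivileged build.
}

# Example use inside build.sh (left as comments so nothing is built here):
#   SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
#   PREFIX="$(privilege_prefix)"
#   ${PREFIX:+"$PREFIX"} apptainer build \
#     "${SCRIPT_DIR}/annotationgx.sif" "${SCRIPT_DIR}/annotationgx.def"
```

On clusters where sudo is unavailable, whether the unprivileged fallback succeeds depends on how Apptainer is configured on the host; some sites may require `apptainer build --fakeroot` instead.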
Summary by CodeRabbit
New Features
Documentation