Merged
9 changes: 5 additions & 4 deletions Dockerfile
@@ -1,5 +1,5 @@
# syntax=docker/dockerfile:1.7-labs
-FROM nvcr.io/nvidia/pytorch:25.05-py3
+FROM nvcr.io/nvidia/pytorch:25.11-py3

# Install dependencies.
RUN apt-get update \
@@ -29,16 +29,17 @@ ENV PIP_CONSTRAINT=""
# There is no pre-build mamba image for pytorch 2.8, we build it before the rest to avoid rebuilds.
Copilot AI Dec 7, 2025
Outdated comment: This comment references "pytorch 2.8" but the PR description indicates the new base image (25.11) includes PyTorch 2.10. The comment should be updated to reflect the current PyTorch version to avoid confusion.

Suggested change
# There is no pre-build mamba image for pytorch 2.8, we build it before the rest to avoid rebuilds.
# There is no pre-build mamba image for pytorch 2.10, we build it before the rest to avoid rebuilds.

# We need to compile from the repo because of https://github.com/state-spaces/mamba/issues/720 (same for causal-conv1d)
Copilot AI Dec 7, 2025
Partially outdated comment: This comment states "We need to compile from the repo because of state-spaces/mamba#720 (same for causal-conv1d)". However, line 33 now installs mamba-ssm from PyPI (not from git repo), so the comment is only accurate for causal-conv1d (line 32). Consider updating the comment to clarify which packages still require compilation from git and which are now installed from PyPI.

Suggested change
# We need to compile from the repo because of https://github.com/state-spaces/mamba/issues/720 (same for causal-conv1d)
# We need to compile causal-conv1d from the repo because of https://github.com/state-spaces/mamba/issues/720.
# mamba-ssm is now installed from PyPI.

# We set the number of workers to avoid OOM when compiling on laptop. (TODO: Can we make it configurable?)
-RUN MAX_JOBS=2 pip install --no-build-isolation "causal-conv1d@git+https://github.com/Dao-AILab/causal-conv1d@2a288a1"
-RUN MAX_JOBS=2 pip install --no-build-isolation "mamba_ssm[causal-conv1d]@git+https://github.com/state-spaces/mamba@4a8a2a2"
+RUN MAX_JOBS=2 pip install --no-build-isolation "causal-conv1d @ git+https://github.com/Dao-AILab/causal-conv1d@v1.5.4"
+RUN MAX_JOBS=2 pip install --no-build-isolation mamba-ssm==2.2.6.post3
Copilot AI Dec 7, 2025
Package name inconsistency: This line uses mamba-ssm (with hyphen) while setup.cfg line 55 uses mamba_ssm (with underscore). For consistency across the codebase, both should use the same format. Recommend using mamba_ssm to match setup.cfg and the import statements used throughout the codebase.

Suggested change
RUN MAX_JOBS=2 pip install --no-build-isolation mamba-ssm==2.2.6.post3
RUN MAX_JOBS=2 pip install --no-build-isolation mamba_ssm==2.2.6.post3

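The hyphen/underscore inconsistency flagged above is cosmetic as far as pip is concerned: PEP 503 requires installers to normalize runs of `-`, `_`, and `.` to a single hyphen (lowercased) before comparing names, so `mamba-ssm` and `mamba_ssm` resolve to the same distribution. A minimal sketch of the normalization rule (the regex is the one given in PEP 503; aligning the spellings in the Dockerfile and setup.cfg is still worthwhile for grep-ability):

```python
import re

def canonicalize_name(name: str) -> str:
    # PEP 503: collapse runs of -, _, . into a single hyphen, then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()

# Both spellings used in the Dockerfile and setup.cfg normalize identically.
print(canonicalize_name("mamba-ssm"))  # mamba-ssm
print(canonicalize_name("mamba_ssm"))  # mamba-ssm
print(canonicalize_name("Mamba_SSM"))  # mamba-ssm
```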
+RUN MAX_JOBS=2 pip install --no-build-isolation "flash-linear-attention @ git+https://github.com/fla-org/flash-linear-attention@67eee20c8503cd19eeb52aa1b99821308e9260c5"
# Copy dependency files with universal write permissions for all users.
COPY --chmod=777 setup.py setup.cfg pyproject.toml ./
COPY --chmod=777 ./fast_llm_external_models/__init__.py fast_llm_external_models/
COPY --chmod=777 ./fast_llm/__init__.py fast_llm/
COPY --chmod=777 ./fast_llm/csrc/ fast_llm/csrc/

# Install dependencies within the virtual environment.
-RUN pip install --no-cache-dir --no-build-isolation -e ".[CORE,OPTIONAL,HUGGINGFACE,SSM,VISION,GENERATION,DEV]" triton==3.1.0
+RUN pip install --no-cache-dir --no-build-isolation -e ".[CORE,OPTIONAL,HUGGINGFACE,SSM,VISION,GENERATION,DEV]" triton==3.5.1

# Copy the remaining source code with universal write permissions.
COPY --chmod=777 ./Megatron-LM Megatron-LM
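The `TODO: Can we make it configurable?` next to the `MAX_JOBS=2` lines could be addressed with a build argument, so laptop builds keep the OOM-safe default while CI overrides it. A hypothetical sketch, not part of this PR (the `ARG` name and default are assumptions):

```dockerfile
# Hypothetical: expose compile parallelism as a build arg.
# Default stays at 2 so laptop builds don't run out of memory.
ARG MAX_JOBS=2
RUN MAX_JOBS=${MAX_JOBS} pip install --no-build-isolation \
    "causal-conv1d @ git+https://github.com/Dao-AILab/causal-conv1d@v1.5.4"
```

A CI build could then pass `docker build --build-arg MAX_JOBS=8 .` to speed up compilation on larger machines.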
12 changes: 6 additions & 6 deletions setup.cfg
@@ -25,10 +25,10 @@ CORE =
# Used for checkpoints
safetensors>=0.5.3
# Update the base image (version fixed to ensure there is a wheel for the base image), may need --no-build-isolation
-flash-attn==2.7.3
-# Dropless MLP is broken with triton 3.2.0, 3.3.0 and 3.3.1. TODO: Remove once a working triton version is released.
-# TODO: Removed because it breaks cpu-only installs and pip dependency resolution.
-# triton==3.1.0
+flash-attn==2.7.4.post1
+# Dropless MoE kernel is broken with triton >= 3.2.0 and needs a rewrite (also limited to 32 experts).
+# Not pinning triton here as it breaks cpu-only installs and pip dependency resolution.
+# triton==3.5.1


# Small packages required for some optional features and tools.
@@ -52,8 +52,8 @@ HUGGINGFACE =
# To install on cpu environment (ex. for IDE support):
# MAMBA_FORCE_BUILD=TRUE CAUSAL_CONV1D_FORCE_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE pip install -e ".[CORE,SSM]" --no-build-isolation
SSM =
-mamba_ssm[causal-conv1d]==2.2.4
-flash-linear-attention @ git+https://github.com/fla-org/flash-linear-attention@main
+mamba_ssm[causal-conv1d]==2.2.6.post3
+flash-linear-attention @ git+https://github.com/fla-org/flash-linear-attention@67eee20c8503cd19eeb52aa1b99821308e9260c5

GENERATION =
lm_eval>=0.4.9