[BUG]: The version of ColossalAI in the Docker image is wrong #3264

@liuzeming-yuxi

🐛 Describe the bug

It appears that on Docker Hub, all image DIGESTs have been identical since version 0.2.0. I checked the colossalai version in the images with pip list and found that they all report 0.2.0+torch1.12cu11.3:

Package                Version
---------------------- ---------------------
apex                   0.1
bcrypt                 4.0.1
brotlipy               0.7.0
certifi                2022.12.7
cffi                   1.15.0
cfgv                   3.3.1
charset-normalizer     2.0.4
click                  8.1.3
colorama               0.4.4
colossalai             0.2.0+torch1.12cu11.3

Using git log, I also found that the most recent commit in the image is outdated:

commit 53bb8682a2e5a0bfe3e3925d943f13ebc9df879d (HEAD -> main, origin/main, origin/HEAD)
Author: Frank Lee <somerlee.9@gmail.com>
Date:   Mon Jan 9 17:57:57 2023 +0800

    [worfklow] added coverage test (#2399)

    * [worfklow] added coverage test

    * polish code

    * polish code

    * polish code

    * polish code

    * polish code

    * polish code

    * polish code

    * polish code

commit ea13a201bbd7eb6022069c8379f3626f9788b0f9
Author: HELSON <c2h214748@gmail.com>
Date:   Mon Jan 9 17:41:38 2023 +0800

    [polish] polish code for get_static_torch_model (#2405)

    * [gemini] polish code

    * [testing] remove code

    * [gemini] make more robust

commit 551cafec14477f17da38d671106341cdc8fed5ff
Author: Frank Lee <somerlee.9@gmail.com>
Date:   Mon Jan 9 17:13:53 2023 +0800

    [doc] updated kernel-related optimisers' docstring (#2385)

It looks like the GitHub Action failed to build the latest version of the image. Looking at the action, I found that the build reused the same cached layer for the git clone of ColossalAI, so the repository was never cloned again.
Action log in [release] v0.2.7:

Step 4/6 : RUN git clone https://github.com/***/ColossalAI.git     && cd ./ColossalAI     && CUDA_EXT=1 pip install -v --no-cache-dir .
 ---> Using cache
 ---> cf4ff39bda10

Action log in [release] v0.2.0:

Step 4/6 : RUN git clone https://github.com/***/ColossalAI.git     && cd ./ColossalAI     && CUDA_EXT=1 pip install -v --no-cache-dir .
 ---> Using cache
 ---> cf4ff39bda10

The same question was raised in the Moby repository. One solution is to add an ARG to the Dockerfile and pass a different value for it on each docker build, so that the cached layer is not reused: moby/moby#1996 (comment)
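
A minimal sketch of that ARG approach, as a fragment rather than the project's actual Dockerfile: the argument name CACHE_BUST, its default value, and the image name in the comment are hypothetical, and the base image and earlier steps are omitted. The RUN line is the one from the action log above, with the argument referenced in it so that a new value changes the layer's cache key.

```dockerfile
# Fragment only: the base image and earlier steps of the real Dockerfile are omitted.

# Hypothetical build argument; its only job is to differ between builds.
ARG CACHE_BUST=unset

# Referencing the argument in the RUN line makes its value part of this
# layer's cache key, so a new value forces a fresh clone and install.
RUN echo "cache bust: ${CACHE_BUST}" \
    && git clone https://github.com/***/ColossalAI.git \
    && cd ./ColossalAI \
    && CUDA_EXT=1 pip install -v --no-cache-dir .

# Assumed invocation (image name is a placeholder):
#   docker build --build-arg CACHE_BUST=$(date +%s) -t <image>:<tag> .
```

Passing the release version instead of a timestamp would also work, since it changes with every release. A blunter alternative is docker build --no-cache, which rebuilds every layer rather than only the ones from the clone step onward.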

Environment

No response
