
[BUG]: ImportError: cannot import name 'GeminiModel' from 'colossalai.booster.plugin.gemini_plugin'  #4594

@alphanlp


🐛 Describe the bug

ImportError: cannot import name 'GeminiModel' from 'colossalai.booster.plugin.gemini_plugin' (/data/llmodel/miniconda3/envs/colossal/lib/python3.9/site-packages/colossalai/booster/plugin/gemini_plugin.py)
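To narrow down an error like this, it can help to check which colossalai version is installed and whether the module named in the traceback exposes the symbol at all. The snippet below is a minimal diagnostic sketch, not a fix. The helper names `has_name` and `installed_version` are hypothetical and introduced only for illustration, and the idea that the training script and the installed package come from different releases is an assumption, not something confirmed in this report.

```python
import importlib
from importlib import metadata


def has_name(module_path, name):
    """Return True/False if the module imports and has/lacks `name`,
    or None if the module itself cannot be imported."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return None
    return hasattr(module, name)


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Module path and symbol are taken from the traceback above.
    print("colossalai version:", installed_version("colossalai"))
    print(
        "GeminiModel present:",
        has_name("colossalai.booster.plugin.gemini_plugin", "GeminiModel"),
    )
```

If the second line prints `False`, the installed package does not define the name the script tries to import. Including both lines of output in the Environment section would make the report easier to reproduce.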

The run script is:

set_n_least_used_CUDA_VISIBLE_DEVICES() {
    local n=${1:-"9999"}
    echo "GPU Memory Usage:"
    local FIRST_N_GPU_IDS=$(nvidia-smi --query-gpu=memory.used --format=csv |
        tail -n +2 |
        nl -v 0 |
        tee /dev/tty |
        sort -g -k 2 |
        awk '{print $1}' |
        head -n $n)
    export CUDA_VISIBLE_DEVICES=$(echo $FIRST_N_GPU_IDS | sed 's/ /,/g')
    echo "Now CUDA_VISIBLE_DEVICES is set to:"
    echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
}

set_n_least_used_CUDA_VISIBLE_DEVICES 4

torchrun --standalone --nproc_per_node=4 train_sft.py \
    --pretrain "/data/llmodel/model_hub/Llama-2-13b-chat-hf" \
    --model 'llama' \
    --strategy colossalai_gemini \
    --log_interval 10 \
    --save_path llama-13b-test/ \
    --dataset /data/llmodel/datasets/CoT_Chinese_data.json \
    --batch_size 4 \
    --accumulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1

Environment

No response


Labels: bug
