Hi Kevin, while trying to reproduce the Prompt Alignment Experiment, I downloaded the llava_server codebase and used the weights from "liuhaotian/llava-v1.5-7b". When I ran
gunicorn "app:create_app()"
I got KeyError: 'llava' while loading the weights.
To work around this, I cloned the latest LLaVA from https://github.com/haotian-liu/LLaVA and modified llava_server/llava.py:
from typing import Iterable, List
from transformers import AutoTokenizer, AutoConfig, LlamaConfig
import torch
import numpy as np
from llava.utils import disable_torch_init
from transformers import CLIPImageProcessor
from PIL import Image
from llava.conversation import simple_conv_multimodal
from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM

DEFAULT_IMAGE_TOKEN = "<image>"
DEFAULT_IMAGE_PATCH_TOKEN = "<im_patch>"
DEFAULT_IM_START_TOKEN = "<im_start>"
DEFAULT_IM_END_TOKEN = "<im_end>"
MAX_TOKENS = 64
PROMPT = simple_conv_multimodal.get_prompt() + "Human: "

def load_llava(params_path):
    # load model
    params_path = "liuhaotian/llava-v1.5-7b"
    disable_torch_init()
    tokenizer = AutoTokenizer.from_pretrained(params_path)

    class LlavaConfig(LlamaConfig):
        model_type = "llava"

    AutoConfig.register("llava", LlavaConfig)
    model = LlavaLlamaForCausalLM.from_pretrained(
        params_path, torch_dtype=torch.float16
    ).cuda()
I'm now testing this code on a machine with 3 A100 GPUs. It loads the weights and starts the server with app.py.
However, when I use 2 GPUs for LLaVA inference and run train.py on the third, I get:
images = images.to("cuda", dtype=torch.float16)
RuntimeError: CUDA error: device-side assert triggered
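One thing that might help narrow this down: CUDA errors are reported asynchronously, so the line in the traceback is not necessarily the operation that failed, and rerunning with the environment variable CUDA_LAUNCH_BLOCKING=1 should point at the real one. A frequent cause of a device-side assert is an index that is out of range for an embedding table, for example a token id that is greater than or equal to the number of embedding rows. Whether that is what happens here is only a guess. The sketch below is a CPU-side check for that case; the helper name `find_out_of_range_ids` and the example ids and vocabulary size are hypothetical and only for illustration.

```python
from typing import Iterable, List, Tuple


def find_out_of_range_ids(
    token_ids: Iterable[int], vocab_size: int
) -> List[Tuple[int, int]]:
    """Return (position, token_id) pairs that a table with vocab_size rows cannot index."""
    return [
        (pos, tid)
        for pos, tid in enumerate(token_ids)
        if tid < 0 or tid >= vocab_size
    ]


# Hypothetical values for illustration only, not taken from the real run.
example_ids = [1, 319, 32000, 2]
bad = find_out_of_range_ids(example_ids, vocab_size=32000)
print(bad)  # [(2, 32000)]
```

In the real code, the ids would presumably come from the tokenizer output and the limit from the model's input embedding size (in transformers, `model.get_input_embeddings().num_embeddings`), checked before the tensors are moved to the GPU.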
I also checked nvidia-smi, and my processes were indeed running on three separate GPUs. Could you help me with this? Thank you!