Conversation

@yunfeng-scale (Contributor) commented Oct 10, 2023

Validate quantization values when creating endpoints


  GIT_TAG: str = os.environ.get("GIT_TAG", "GIT_TAG_NOT_FOUND")
- if GIT_TAG == "GIT_TAG_NOT_FOUND":
+ if GIT_TAG == "GIT_TAG_NOT_FOUND" and "pytest" not in sys.modules:

yunfeng-scale (Contributor, Author):

make pytest work without specifying GIT_TAG
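
A minimal sketch of how the guard reads in context, assuming the surrounding module raises when GIT_TAG is missing outside of tests (the exception type and message here are hypothetical, not necessarily the repo's actual handling):

import os
import sys

GIT_TAG: str = os.environ.get("GIT_TAG", "GIT_TAG_NOT_FOUND")
# Importing pytest registers it in sys.modules, so the check is skipped during
# unit tests and GIT_TAG does not need to be set in the test environment.
if GIT_TAG == "GIT_TAG_NOT_FOUND" and "pytest" not in sys.modules:
    raise ValueError("GIT_TAG environment variable must be set")  # hypothetical error handling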

LLMInferenceFramework.DEEPSPEED: [],
LLMInferenceFramework.TEXT_GENERATION_INFERENCE: [Quantization.BITSANDBYTES],
LLMInferenceFramework.VLLM: [Quantization.AWQ],
LLMInferenceFramework.LIGHTLLM: [],

Contributor:

Probably best for a separate PR, but can you update the docs to specify which models in the model zoo support LightLLM as an inference framework?
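
For context, a minimal sketch of how a mapping like the one above could drive the validation this PR adds. The dict and function names are assumptions for illustration (the LLMInferenceFramework, Quantization, and ObjectHasInvalidValueException identifiers come from the diff snippets; their imports are repo-specific and omitted here):

_SUPPORTED_QUANTIZATIONS = {
    LLMInferenceFramework.DEEPSPEED: [],
    LLMInferenceFramework.TEXT_GENERATION_INFERENCE: [Quantization.BITSANDBYTES],
    LLMInferenceFramework.VLLM: [Quantization.AWQ],
    LLMInferenceFramework.LIGHTLLM: [],
}

def validate_quantization(quantize, inference_framework):
    # Reject quantization schemes that the chosen inference framework does not support.
    supported = _SUPPORTED_QUANTIZATIONS[inference_framework]
    if quantize is not None and quantize not in supported:
        raise ObjectHasInvalidValueException(
            f"Quantization {quantize} is not supported by {inference_framework}. "
            f"Supported values: {supported}."
        )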

if num_shards > gpus:
    raise ObjectHasInvalidValueException(
        f"Num shard {num_shards} must be less than or equal to the number of GPUs {gpus}."
    )

Contributor:

nit: could mention the inference framework in the error msg
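
A sketch of the error message with the reviewer's nit applied, assuming the validation code has the inference framework in scope (variable name inference_framework is an assumption):

if num_shards > gpus:
    raise ObjectHasInvalidValueException(
        f"Num shard {num_shards} must be less than or equal to the number of GPUs {gpus} "
        f"for inference framework {inference_framework}."
    )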

@yunfeng-scale yunfeng-scale enabled auto-merge (squash) October 11, 2023 19:15
@yunfeng-scale yunfeng-scale merged commit 60ac144 into main Oct 12, 2023
@yunfeng-scale yunfeng-scale deleted the yunfeng-validate-quantization branch October 12, 2023 01:08