Create "docker model bench MODEL" #480

@ericcurtin

Description

It should be able to output the tokens per second of any given model with 1, 2, 4, and 8 concurrent requests, and provide a hyperfine-like experience.
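
To make the request concrete, here is a rough sketch of the measurement loop in Python. It is not a proposal for the actual implementation: `fake_generate` is a hypothetical stand-in for one request to the model (a real version would call the model's API and return the number of generated tokens), and the function names and the default of 3 runs are assumptions. Each concurrency level is run several times, and the mean and standard deviation of the aggregate tokens per second are reported, in the way hyperfine summarizes repeated runs.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_generate(prompt: str) -> int:
    """Hypothetical stand-in for one model request; returns tokens generated."""
    time.sleep(0.01)  # simulate request latency
    return 16         # pretend the model produced 16 tokens


def bench(generate, prompt, concurrency, runs=3):
    """Return (mean, stdev) of aggregate tokens/second over `runs` rounds.

    Each round issues `concurrency` requests at once and waits for all of them.
    """
    rates = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for _ in range(runs):
            start = time.perf_counter()
            tokens = sum(pool.map(generate, [prompt] * concurrency))
            elapsed = time.perf_counter() - start
            rates.append(tokens / elapsed)
    stdev = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return statistics.mean(rates), stdev


def run_all(generate, prompt, levels=(1, 2, 4, 8), runs=3):
    """Benchmark every concurrency level and return {level: (mean, stdev)}."""
    return {c: bench(generate, prompt, c, runs) for c in levels}


if __name__ == "__main__":
    for c, (mean, sd) in run_all(fake_generate, "hello").items():
        print(f"concurrency {c}: {mean:8.1f} tok/s ± {sd:.1f}")
```

Swapping `fake_generate` for a function that sends a real request and counts the returned tokens would give one summary line per concurrency level.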
