It should report the tokens per second of any given model at 1, 2, 4, and 8 concurrent requests, and provide a hyperfine-like experience: warmup runs, repeated timed runs, and summary statistics for each concurrency level.
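One way this could look is sketched below in Python. It is a minimal illustration, not a definitive design: `fake_generate` is a hypothetical stand-in for a real model call (it sleeps briefly and returns a fixed token count), and `run_once`, `benchmark`, and `format_report` are names invented for this example. Throughput is taken as the total tokens returned by all concurrent requests divided by the wall-clock time of the batch.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_generate(prompt: str) -> int:
    """Hypothetical stand-in for a model call; returns the number of tokens generated."""
    time.sleep(0.01)  # pretend the model took 10 ms
    return 32         # pretend it produced 32 tokens


def run_once(generate, concurrency: int, prompt: str = "Hello") -> float:
    """Send `concurrency` requests at once and return aggregate tokens per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        token_counts = list(pool.map(generate, [prompt] * concurrency))
    elapsed = time.perf_counter() - start
    return sum(token_counts) / elapsed


def benchmark(generate, levels=(1, 2, 4, 8), warmup: int = 1, runs: int = 5) -> dict:
    """Run warmups, then `runs` timed batches for each concurrency level."""
    results = {}
    for concurrency in levels:
        for _ in range(warmup):
            run_once(generate, concurrency)  # warmup results are discarded
        samples = [run_once(generate, concurrency) for _ in range(runs)]
        results[concurrency] = {
            "mean": statistics.mean(samples),
            "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
            "min": min(samples),
            "max": max(samples),
            "runs": len(samples),
        }
    return results


def format_report(results: dict) -> str:
    """Render the statistics in a layout loosely modeled on hyperfine's output."""
    lines = []
    for concurrency, s in results.items():
        lines.append(f"Concurrency {concurrency}")
        lines.append(f"  Tokens/s (mean ± σ): {s['mean']:10.1f} ± {s['stdev']:.1f}")
        lines.append(f"  Range (min … max):   {s['min']:10.1f} … {s['max']:.1f}    {s['runs']} runs")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(benchmark(fake_generate)))
```

To benchmark a real model, `fake_generate` would be replaced by a function that sends one request to the model and returns the number of tokens in the response. Threads are used here because such a call is typically I/O-bound; an async client would work equally well.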