Describe the bug
- The number of concurrent requests is no longer shown for each data point.
- "Tokens/prompt" is displayed as 72, 8, 4, 2, or 1, which I believe actually refers to the number of GPUs in the cluster.
To Reproduce
Steps to reproduce the behavior:
1. Go to https://inferencemax.semianalysis.com/
2. View any graph and mouse over a data point.
Expected behavior
The tooltip should show the number of concurrent requests in the test scenario, and the number of GPUs should not be mislabeled as "tokens/prompt".
Screenshots
Desktop (please complete the following information):
- OS: Linux
- Browser: Firefox
- Version: 145