Drastically changed results for DeepSeek R1 0528 fp8 8k/1k #192

@asb

The previously reported results (taken from here):

Image

The results from the website for the 11/04/2025, 03:41:04 UTC run:

Image

The end-to-end latency results are generally lower (much lower for the GB200 NVL72), while the token throughput per GPU appears to be much diminished for anything other than the GB200 NVL72. Is this an expected change? If so, what is the reason?

If this is an expected change (e.g. due to a bug fix or similar), it would be very helpful to have a high-level changelog on the results website that notes big changes, such as a change of inference engine that produces very different results, or a bug fix with a similar impact. That is, something much higher level than what you would get from trawling through the commit history, which has plenty of changes with no expected impact on the reported results.
