This repository was archived by the owner on May 6, 2026. It is now read-only.
Document GKE TPU Performance #133
Merged
michaelasp
reviewed
Jun 23, 2025
Another important factor is DraNet's ability to pass interface configuration options that tune the interfaces for maximum performance, for example [Big TCP](https://lwn.net/Articles/884104/).
In addition, if you have gVNIC enabled, you can use some private ethtool flags that improve TCP performance, such as [enable-max-rx-buffer-size](enable-max-rx-buffer-size).
Contributor
Should we explain what these flags do and why we are setting these values? Especially the private flags.
Contributor
Author
Yeah, I was trying to find some public docs to reference, but I only found this: https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/blob/d4e3772aea0fec953f33f1776eea33f8e9d9e2ee/build/gve_ethtool.c#L87
or https://github.com/search?q=enable-max-rx-buffer-size&type=code
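For readers following this thread, private driver flags like the one quoted in the diff are normally inspected and toggled with `ethtool`'s `--show-priv-flags` and `--set-priv-flags` options. The sketch below is illustrative only and is not taken from the PR: the interface name `eth1` and the `run` dry-run helper are assumptions, and whether a given gVNIC driver version exposes `enable-max-rx-buffer-size` is not confirmed here. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: list and enable a gVNIC private ethtool flag.
# Assumptions (not from the PR): interface name "eth1"; "run" is a
# hypothetical dry-run wrapper so the script is safe to read and test.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command instead of executing it.
    echo "$*"
  else
    "$@"
  fi
}

tune_gvnic() {
  iface="$1"
  # Show which private flags the driver exposes on this interface.
  run ethtool --show-priv-flags "$iface"
  # Turn on the flag named in the diff under review.
  run ethtool --set-priv-flags "$iface" enable-max-rx-buffer-size on
}

tune_gvnic "${IFACE:-eth1}"
```

Running it with `DRY_RUN=0 IFACE=<name>` would execute the two commands for real, which requires root and a driver that actually advertises the flag.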
Change-Id: Ic2cbf3ce0a4f40932268050db7f1d3ff3053429f
Change-Id: I535e6deebb36a861c32ded91f33549815e7f0275
samos123
reviewed
Jun 23, 2025
gcloud compute --project=${PROJECT?} \
  networks subnets create \
  tpu-net-2-sub \
  --network=tpu-net-1 \
It would be helpful to have a before-and-after picture: with hostNetwork=true vs. with dranet. I suspect dranet perf is actually much better.
It turns out that some processes, such as Cilium, pin their programs to the bpf filesystem, so we need to delete those pins to be able to remove the bpf programs; otherwise we cannot detach them because they are still referenced.
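As a rough illustration of the point above: a pin on bpffs shows up as a regular file, and unlinking that file drops the reference the pin holds. The sketch below is not the code from this change; the `remove_pins` helper and the directory passed to it are hypothetical, and a real cleanup would need to target only the pins it is meant to remove.

```shell
#!/bin/sh
# Sketch: unlink every pin file under a given bpffs directory.
# Assumptions (not from the PR): "remove_pins" is a hypothetical helper and
# the caller supplies the directory (for example a subdirectory of /sys/fs/bpf).
remove_pins() {
  dir="$1"
  # Nothing to do if the directory does not exist.
  [ -d "$dir" ] || return 0
  # Print each pin path, then unlink it; directories are left in place.
  find "$dir" -type f -print -exec rm -f {} \;
}

# Only act when a directory is passed explicitly on the command line.
if [ "$#" -gt 0 ]; then
  remove_pins "$1"
fi
```

Once no pin (or other holder) references a program any more, it can be detached and freed; the sketch deliberately refuses to run without an explicit directory argument.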
Add documentation about how to maximize TCP throughput on TPU v6 machines by using two virtual interfaces that map to the two physical interfaces of the physical VM. @samos123, you'll be interested in this.