There is a small annoyance I found when adding figures to model cards. The problem is that the path to the file on the storage device and the path saved in the model card can differ. This can easily result in a model card referencing figures that cannot be found. Let me give an example:
I want to create a model card and later upload it to the HF Hub. For this, I create a temp folder and initialize a repo:
from pathlib import Path
from tempfile import mkdtemp
hub_dir = Path(mkdtemp())
hub_utils.init(..., dst=hub_dir)
Then I create my model card and add a figure like so:
model_card = card.Card(...)
fig_path = hub_dir / "my_plot.png"
fig.savefig(fig_path)
model_card.add_plot(**{"My beautiful plot": fig_path})
...
hub_utils.push(..., source=hub_dir)
The issue is that the plot is stored in the model card under hub_dir / "my_plot.png", i.e. using its absolute path. When I now upload the model card to the Hub, the Hub will look for the plot in said temporary directory, which does not exist there. On the Hub, what I actually want is the relative path, i.e., my_plot.png. But I cannot simply use the relative path instead of the absolute path when saving the figure either, because then the figure ends up in my current working directory, not in hub_dir, and so it is not uploaded when calling hub_utils.push(..., source=hub_dir).
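To make the mismatch concrete, here is a minimal sketch using only the standard library. An empty file stands in for the saved figure, and nothing here touches the model card or the Hub; it only shows the two path forms involved.

```python
from pathlib import Path
from tempfile import mkdtemp

hub_dir = Path(mkdtemp())
fig_path = hub_dir / "my_plot.png"
fig_path.write_bytes(b"")  # stand-in for fig.savefig(fig_path)

# The absolute path is only meaningful on this machine.
print(fig_path.is_absolute())

# The path relative to the repo root is what stays valid after uploading.
rel_path = fig_path.relative_to(hub_dir)
print(rel_path.as_posix())
```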
Now I can work around this by changing my code to
model_card = card.Card(...)
fig.savefig(hub_dir / "my_plot.png") # <= use abs path here
model_card.add_plot(**{"My beautiful plot": "my_plot.png"}) # <= use rel path here
but this is an ugly gotcha. Also, even though it will work correctly on the Hub, rendering the model card locally to preview it will fail, because the relative path is resolved against my current working directory and not against hub_dir.
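For the local preview, the relative reference would have to be resolved against the repo folder. A rough sketch of what I mean; resolve_for_preview is a hypothetical helper, not something the library provides:

```python
from pathlib import Path
from tempfile import mkdtemp


def resolve_for_preview(ref, repo_dir):
    """Hypothetical helper: map a reference stored in the card to a local file.

    Absolute references are returned unchanged, relative ones are
    interpreted relative to the repo folder.
    """
    ref = Path(ref)
    return ref if ref.is_absolute() else Path(repo_dir) / ref


hub_dir = Path(mkdtemp())
(hub_dir / "my_plot.png").write_bytes(b"")  # stand-in for the saved figure

stored_ref = "my_plot.png"  # what the workaround puts into the model card
local_path = resolve_for_preview(stored_ref, hub_dir)
print(local_path.is_file())
```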
Even worse, however, is when I want to add permutation importances. When I do
pi = permutation_importance(model, X, y)
model_card.add_permutation_importances(pi, plot_file="my-permutation-importances.png", ...)
there will be a problem because the method saves the file locally using the plot_file argument and writes that same argument into the model card. If I pass a path that points into hub_dir, the file will not be found when the model card is rendered on the Hub. And if I pass the bare relative path, the file is saved to my current working directory and will not be included in the upload. The only solutions are either not to use add_permutation_importances and instead create the figure by hand and call add_plot as discussed above, or to manually copy the file to the correct location.
I guess the problem could be avoided if the current working directory and the temporary directory for pushing are always identical, but requiring this is also annoying and I don't think we can assume it's always the case.
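The manual copy could look roughly like the following. copy_into_repo is a hypothetical helper, and the base_dir parameter is my own addition so that the example does not write into the real working directory; the plot file itself is faked with a few bytes.

```python
import shutil
from pathlib import Path
from tempfile import mkdtemp


def copy_into_repo(plot_file, repo_dir, base_dir=None):
    """Hypothetical helper: copy a plot into the repo under the same relative path."""
    plot_file = Path(plot_file)
    if plot_file.is_absolute():
        raise ValueError("plot_file must be relative so the reference works on the Hub")
    # By default, look for the file where it was saved: the working directory.
    base_dir = Path.cwd() if base_dir is None else Path(base_dir)
    dest = Path(repo_dir) / plot_file
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(base_dir / plot_file, dest)
    return dest


work_dir = Path(mkdtemp())  # stand-in for the current working directory
hub_dir = Path(mkdtemp())
plot_file = "my-permutation-importances.png"
(work_dir / plot_file).write_bytes(b"fake image data")

dest = copy_into_repo(plot_file, hub_dir, base_dir=work_dir)
print(dest.relative_to(hub_dir).as_posix())
```

After the copy, the relative path in the model card and the file inside hub_dir agree, so the plot is part of the upload.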
My guess why we haven't really encountered this problem yet is that in our examples, we either create model cards with plots but don't upload them to the Hub, or we do upload model cards to the Hub but they don't contain plots. We don't have an example that combines both (I'm working on exactly that, which is how I stumbled upon this).
Anyway, I don't have a solution to this problem. We can better document it and maybe warn users if a repo is created with a model card that references plots that are not inside the repo. But that's not very satisfying. Anyone got better ideas?
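For reference, the warning idea could be sketched like this. It assumes the rendered model card references plots with standard markdown image syntax, which I have not verified for every case, and find_unresolvable_plots is a hypothetical name, not an existing function.

```python
import re
import warnings
from pathlib import Path
from tempfile import mkdtemp

# Matches markdown images such as ![alt text](some/path.png)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def find_unresolvable_plots(card_text, repo_dir):
    """Hypothetical check: return image references the Hub could not resolve."""
    repo_dir = Path(repo_dir)
    problems = []
    for ref in IMAGE_REF.findall(card_text):
        if "://" in ref:
            continue  # remote URLs are not the repo's concern
        path = Path(ref)
        # Absolute paths only exist locally; relative ones must be inside the repo.
        if path.is_absolute() or not (repo_dir / path).is_file():
            problems.append(ref)
    return problems


repo_dir = Path(mkdtemp())
(repo_dir / "my_plot.png").write_bytes(b"")
abs_ref = (repo_dir / "my_plot.png").as_posix()
card_text = "\n".join(
    [
        "![ok](my_plot.png)",
        f"![absolute]({abs_ref})",
        "![missing](my-permutation-importances.png)",
    ]
)

problems = find_unresolvable_plots(card_text, repo_dir)
for ref in problems:
    warnings.warn(f"Plot {ref!r} will not be found on the Hub")
```

This only detects the problem when the repo is created; it does not fix the underlying path mismatch.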