Skip to content
This repository was archived by the owner on Jun 24, 2024. It is now read-only.

refactor(llama): remove bincode#123

Merged
philpax merged 3 commits intorustformers:mainfrom
philpax:remove-bincode
Apr 12, 2023
Merged

refactor(llama): remove bincode#123
philpax merged 3 commits intorustformers:mainfrom
philpax:remove-bincode

Conversation

@philpax
Copy link
Copy Markdown
Collaborator

@philpax philpax commented Apr 7, 2023

While writing up #122, I realised it doesn't really make sense to be prescriptive about how to read or write a snapshot to disk. This PR moves all read/write logic to the CLI, and leaves the snapshot/ref as Serde-compatible so that users can make their own decisions on how to snapshot.

The main benefit of this is that it removes the bincode dependency from the library.

@philpax philpax added the meta:maintenance Changes that will make it easier for us to maintain code label Apr 7, 2023
@philpax philpax requested a review from setzer22 April 7, 2023 18:37
@philpax philpax added this to the 0.1 milestone Apr 10, 2023
Copy link
Copy Markdown
Collaborator

@setzer22 setzer22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍 Not much to add, the change makes sense, we don't need bincode and any llama-rs user can pick the serialization library that best fits them.

But I think it would be worth it to add a comment on InferenceSession to make it clear to users to always pick a binary format (i.e. no JSON or RON) because serializing large number matrices in this format could easily increase the size (of the already quite large snapshots) by several orders of magnitude.

@philpax philpax merged commit 0e553a0 into rustformers:main Apr 12, 2023
@philpax philpax deleted the remove-bincode branch April 12, 2023 23:13
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

meta:maintenance Changes that will make it easier for us to maintain code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants