Refactor most code in main.cpp into a separate module (preparing to implement TCP mode) #267
tarruda wants to merge 5 commits into ggml-org:master
Signed-off-by: Thiago Padilha <thiago@padilha.cc>
Force-pushed from 9132124 to 934840d
The goal is to allow running llama_main while connected to other streams, such as TCP sockets. Signed-off-by: Thiago Padilha <thiago@padilha.cc>
Force-pushed from 934840d to edc17cf
How does this PR tie into the currently active refactor in #77?
I was not aware of that PR; I should have searched for it first. The only reason I created this PR is that I had a clear vision of how to implement a TCP server mode in llama.cpp. Honestly, I'm not sure what to do. Should I close this PR?
Not my call, but you could review the other PR with your insight :)
I had a quick look and it seems that the goal in #77 is to make llama.cpp embeddable as a library, which requires modifying/refactoring more than what I do here. This PR has no such goals and makes almost no changes to existing code. It can be summarized as:
Closing in favor of #278
The goal of this refactor is to allow reusing the model execution code with streams other than stdin/stdout for interaction.
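To illustrate what "streams other than stdin/stdout" could mean in practice, here is a minimal C++ sketch of an entry point that takes its input and output streams as parameters. The name run_session and its signature are hypothetical (the actual llama_main interface is not shown in this description), and the echo logic only stands in for model execution.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a refactored entry point: it never touches
// std::cin or std::cout directly, only the streams the caller passes in.
int run_session(std::istream& in, std::ostream& out) {
    std::string line;
    while (std::getline(in, line)) {
        // Assumption: a real implementation would feed `line` to the model
        // here and stream generated tokens to `out`.
        out << "echo: " << line << '\n';
    }
    out.flush();
    return 0;
}
```

With this shape, the interactive CLI would call run_session(std::cin, std::cout), while another front end could pass string streams or socket-backed streams instead.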
In my case, I'd like to implement a simple TCP server (enabled by a command-line option) that will run llama_main for each new connection, with each connection handled in a child process via fork(). This would bring a few benefits:
If this PR is accepted, I will follow up with a PR that implements the TCP server command-line option.
This PR is simpler to review than it appears. Just look at the commits individually (most of the additions/deletions happen in the first commit, where main.cpp is simply renamed as llama.cpp).