Conversation
Force-pushed from 1997bf6 to 6a185ca |
Let's merge this to master as it's add-only and doesn't hurt as a starting point. I successfully built it on Colab, but I have no way to test this locally. I'll update the docs and we'll see what comes out of bug reports. |
|
Might be worth dropping this command in a README so folks can test that they have a valid, detectable GPU. Example output showing a valid GPU: |
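The exact command and sample output aren't preserved above. A common way to run this check on NVIDIA hardware (an assumption, not necessarily the command the comment refers to) is:

```sh
# List GPUs visible to the NVIDIA driver; a valid, detectable GPU shows up
# with its name, memory and driver version in the output table.
nvidia-smi
```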
|
Good stuff! Although it seems there's a catch: the following solves the issue, and it's necessary because otherwise llama.cpp compiles without cuBLAS support; with it, the build works. |
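The exact fix isn't preserved above. A plausible sketch, assuming the missing piece is forwarding the cuBLAS option to llama.cpp's cmake build (LLAMA_CUBLAS was the option name llama.cpp used at the time):

```sh
# Hypothetical: configure and build llama.cpp with cuBLAS enabled.
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
```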
good catch @Thireus! thanks! - do you also have a GPU at hand so you can test this out? also, do you feel like taking a stab at fixing it? otherwise I'll have a look soon |
|
Hey there! I've run into a couple of issues. My config:
- name: gpt-3.5-turbo
parameters:
model: Manticore-13B.ggmlv3.q4_0.bin
temperature: 0.3
context_size: 2048
threads: 6
backend: llama
stopwords:
- "USER:"
- "### Instruction:"
roles:
user: "USER:"
system: "ASSISTANT:"
assistant: "ASSISTANT:"
gpu_layers: 40
Using the provided YAML like in the model-gallery yields an error. Cheers! |
Depends on: go-skynet/go-llama.cpp#51
See upstream PR: ggml-org/llama.cpp#1412
Allows building LocalAI with the llama.cpp backend with cuBLAS/OpenBLAS:

cuBLAS
To build, run:
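The command itself isn't shown above; a sketch, assuming the Makefile exposes a BUILD_TYPE variable as later LocalAI releases do:

```sh
# Hypothetical: build LocalAI with the cuBLAS-enabled llama.cpp backend.
make BUILD_TYPE=cublas build
```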
OpenBLAS
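Likewise for OpenBLAS, under the same assumption about the BUILD_TYPE variable:

```sh
# Hypothetical: build LocalAI with the OpenBLAS-enabled llama.cpp backend.
make BUILD_TYPE=openblas build
```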
To set the number of GPU layers, in the config file:
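For example, mirroring the field from the config reported earlier in this thread:

```yaml
# Number of model layers to offload to the GPU.
gpu_layers: 40
```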
This also drops the "generic" build type, as I'm sunsetting it in favor of specific cmake parameters
Related to: #69