Note: This issue was copied from ggml-org#6263
Original Author: @phymbert
Original Issue Number: ggml-org#6263
Created: 2024-03-23T17:03:49Z
Context
There is no advantage to increasing n_batch above n_ubatch for embedding models with pooling, because the entire batch must fit in a single physical batch (i.e. n_ubatch). n_batch is always >= n_ubatch.
Proposition
Exit with failure if --embedding is set and --ubatch-size != --batch-size in the server example. Probably also in the retrieval example in ggml-org#6193.
Also, the bert.context_size KV probably must be taken into account.
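A minimal sketch of the proposed check, assuming the parsed options are available as plain fields. The struct `embedding_batch_params` and the function `validate_embedding_batch` are hypothetical names used for illustration only and are not part of llama.cpp; the bert.context_size consideration is left out.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the server's parsed command-line options.
// Field names mirror the flags discussed in this issue; this is not llama.cpp's actual struct.
struct embedding_batch_params {
    bool     embedding = false; // --embedding
    uint32_t n_batch   = 0;     // --batch-size  (logical batch)
    uint32_t n_ubatch  = 0;     // --ubatch-size (physical batch)
};

// Returns an empty string when the configuration is acceptable,
// otherwise a message describing why startup should fail.
std::string validate_embedding_batch(const embedding_batch_params & p) {
    // The rule only applies in embedding mode: there the whole batch
    // must fit in one physical batch, so the two sizes must match.
    if (p.embedding && p.n_batch != p.n_ubatch) {
        return "--embedding requires --ubatch-size (" + std::to_string(p.n_ubatch) +
               ") to be equal to --batch-size (" + std::to_string(p.n_batch) + ")";
    }
    return "";
}
```

A caller would run this once after argument parsing, print the returned message to stderr if it is non-empty, and exit with a failure status.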
See also: ggml-org/llama.cpp#6254 (comment)