Conversation
I think this is good enough until we update the model files.

Perplexity with Details

```
$ ./perplexity -m models/Llama-2-7B-GGML/llama-2-7b.ggmlv3.q5_K_M.bin -f wikitext-2-raw/wiki.test.raw -t 1 -ngl 99
system_info: n_threads = 1 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
llama_print_timings: load time = 2575.04 ms

$ ./perplexity -m models/Llama-2-7B-GGML/llama-2-7b.ggmlv3.q5_K_M.bin -f wikitext-2-raw/wiki.test.raw -t 1 -ngl 99 -eps 1
system_info: n_threads = 1 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
llama_print_timings: load time = 3436.11 ms
```
Thanks - will take a look tmrw
Fixes #2373

Use `-eps 1e-5` with llama 2; defaults to `1e-6` (same as current, for llama v1).
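For context, the epsilon here is the stabilizer inside RMS normalization: Llama 1 checkpoints were trained with `1e-6`, while Llama 2 uses `1e-5`, so evaluating a Llama 2 model with the wrong value subtly shifts every normalized activation. A minimal NumPy sketch of the operation (the function name `rms_norm` and the sample vector are illustrative, not from llama.cpp):

```python
import numpy as np

def rms_norm(x, eps):
    # RMSNorm: divide by the root mean square of x, with eps added
    # under the square root for numerical stability.
    return x / np.sqrt(np.mean(x * x) + eps)

x = np.array([0.1, -0.2, 0.3, 0.05], dtype=np.float32)

out_v1 = rms_norm(x, 1e-6)  # Llama v1 epsilon
out_v2 = rms_norm(x, 1e-5)  # Llama v2 epsilon

# The two epsilons produce slightly different activations; the gap
# compounds across layers, which is why perplexity degrades when the
# value does not match the one used at training time.
print(np.max(np.abs(out_v1 - out_v2)))
```

The difference per call is tiny, but it is applied at every transformer layer, which is why a command-line override like `-eps` matters until the correct value is stored in the model files themselves.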