On machines with smaller memory and slower processors, it can be useful to reduce the overall number of threads running. For instance on my MacBook Pro Intel i5 16Gb machine, 4 threads is much faster than 8. Try:
make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -p "Author-contribution statements and acknowledgements in research papers should state clearly and specifically whether, and to what extent, the authors used AI technologies such as ChatGPT " -t 4 -n 512
On machines with smaller memory and slower processors, it can be useful to reduce the overall number of threads running. For instance on my MacBook Pro Intel i5 16Gb machine, 4 threads is much faster than 8. Try:
make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -p "Author-contribution statements and acknowledgements in research papers should state clearly and specifically whether, and to what extent, the authors used AI technologies such as ChatGPT " -t 4 -n 512