There was a performance regression in earlier versions of llama.cpp that I may be hitting with long-running interactions. This was recently fixed with the addition of a `--no-mmap` option, which forces the entire model to be loaded into RAM, and I would like to be able to use it with koboldcpp as well.
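For reference, this is roughly how the flag is used with llama.cpp's `main` binary today; a koboldcpp equivalent could presumably take the same shape (the model path and prompt below are placeholders, not real files):

```sh
# --no-mmap disables mmap-based loading, so the whole model is read
# into RAM up front instead of being paged in from disk on demand.
./main -m ./models/7B/ggml-model-q4_0.bin --no-mmap -p "Hello"
```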
ggml-org#801