Now that distributed inference is supported, thanks to the work of @evanmiller in #2099, it would be fun to use it for something cool. One such idea is to connect a bunch of Raspberry Pis in a local network and run the inference using MPI:
# sample cluster of 8 devices (replace with actual IP addresses of the devices)
$ cat ./hostfile
192.168.0.1:1
192.168.0.2:1
192.168.0.3:1
192.168.0.4:1
192.168.0.5:1
192.168.0.6:1
192.168.0.7:1
192.168.0.8:1
# build with MPI support
$ make CC=mpicc CXX=mpicxx LLAMA_MPI=1 -j
# run distributed inference over 8 nodes
$ mpirun -hostfile ./hostfile -n 8 ./main -m /mnt/models/65B/ggml-model-q4_0.bin -p "I believe the meaning of life is" -n 64
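For a larger cluster, the hostfile could be generated instead of typed by hand. The following is a minimal sketch, assuming the devices have consecutive addresses in a single subnet; `gen_hostfile` and its arguments are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print one "address:slots" line per device,
# in the same format as the sample hostfile.
# Usage: gen_hostfile PREFIX FIRST COUNT [SLOTS]
gen_hostfile() {
    prefix=$1          # first three octets, e.g. 192.168.0
    first=$2           # last octet of the first device
    count=$3           # number of devices
    slots=${4:-1}      # processes per device (defaults to 1)
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s.%d:%d\n' "$prefix" "$((first + i))" "$slots"
        i=$((i + 1))
    done
}

# Prints the same 8 lines as the sample cluster
gen_hostfile 192.168.0 1 8
```

Redirecting the output (`gen_hostfile 192.168.0 1 8 > ./hostfile`) would produce the file passed to `mpirun`.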
Here we assume that the 65B model data is located on a network share in /mnt and that mmap works over a network share.
I'm not sure whether that is the case. If it isn't, the experiment will be more difficult to perform.
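Before attempting a full run, it may help to confirm that every device can at least see the model file. The sketch below is only a preflight check under that assumption: `check_model` is a hypothetical helper, and it does not exercise `mmap` at all, it only catches a missing, unreadable, or empty file on the share:

```shell
#!/bin/sh
# Hypothetical preflight helper: confirm that a model file is visible
# and readable from the node this runs on.
# It does NOT test mmap; it only detects a missing or broken mount.
check_model() {
    path=$1
    if [ ! -r "$path" ]; then
        echo "$(uname -n): cannot read $path" >&2
        return 1
    fi
    if [ ! -s "$path" ]; then
        echo "$(uname -n): $path is empty" >&2
        return 1
    fi
    echo "$(uname -n): ok"
}
```

Saved as a script that ends with `check_model "$1"`, this could presumably be launched on all nodes with the same hostfile, for example `mpirun -hostfile ./hostfile -n 8 sh ./check_model.sh /mnt/models/65B/ggml-model-q4_0.bin`, where the script name is again just a placeholder.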
Looking for people with access to the necessary hardware to perform this experiment.