Default value for max locked memory (`ulimit -l`) is too low.
When running PyTorch and allocating large amounts of memory (more than 8 MB), the following errors appear:
[1740171758.025024] [gpu6:1025296:0] mpool.c:269 UCX ERROR Failed to allocate memory pool (name=ud_recv_skb) chunk: Input/output error
[1740171758.025504] [gpu6:1025296:0] ib_md.c:282 UCX ERROR ibv_reg_mr(address=0x7ffdd8600000, length=20971520, access=0x10000f) failed: Cannot allocate memory : Please set max locked memory (ulimit -l) to 'unlimited' (current: 8192 kbytes)
Output of `ulimit -a` with the default limits:
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 3092588
max locked memory (kbytes, -l) 8192
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 513293
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Output of `ulimit -a` after running `ulimit -l unlimited`:
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 3092588
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 513293
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
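The same limit can be inspected from inside the Python process that runs PyTorch. The following is a minimal sketch, assuming a Linux host where the standard-library `resource` module exposes `RLIMIT_MEMLOCK`. The helper names `format_limit` and `raise_soft_memlock_to_hard` are made up for illustration and are not part of PyTorch or UCX. It can only raise the soft limit up to the existing hard limit, so it will not help if the hard limit is itself 8192 kbytes.

```python
import resource


def format_limit(value):
    """Render a limit in bytes the way `ulimit -l` shows it (kbytes or 'unlimited')."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kbytes"


def raise_soft_memlock_to_hard():
    """Raise the soft max-locked-memory limit to the hard limit for this process.

    An unprivileged process may raise its soft limit up to the hard limit,
    but not beyond it. Returns the (soft, hard) pair after the attempt.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft != hard:
        resource.setrlimit(resource.RLIMIT_MEMLOCK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("before: soft =", format_limit(soft), "hard =", format_limit(hard))
    soft, hard = raise_soft_memlock_to_hard()
    print("after:  soft =", format_limit(soft), "hard =", format_limit(hard))
```

Presumably this would have to run before the communication library registers any memory, since the limit is checked when the registration call is made. Note also that `ulimit -l unlimited` in a shell only affects that shell and the processes it starts afterwards.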
### Expected Results
See above
### Additional information
See above