Skip to content

Launcher script does not work if hosts require 2-factor authentication #4

@jamesr66a

Description

@jamesr66a

With a simple invocation of run_varuna:

$ python -m varuna.run_varuna --batch_size 3 --nstages 3 --chunk_size 1 varuan_test.py

The script has the following terminal output:

No apex!
No apex
['127.0.0.1']
ssh 127.0.0.1 echo "python -u -m varuna.launcher --ngpus_per_server 4   --node_rank 0 --nservers 1 --master_addr 127.0.0.1 --nstages 3 --batch_size 3 --chunk_size 1 --code_dir /fsx/users/jamesreed/varuna varuan_test.py" > launch_varuna.sh;  VARUNA_MANAGER_IP=10.200.30.184 VARUNA_MORPH_PORT=4200 VARUNA_HEARTBEAT_PORT=5000  bash launch_varuna.sh

With the last line (i believe) indicating that run_varuna is trying to use ssh to to issue a command on the host (in this case localhost). However, this host requires two-factor authentication on login. run_varuna launches this command (twice, seemingly) as a background process and ostensibly with stdin disconnected, thus the interactive 2fa prompt cannot be completed. It would be good to have a way to issue commands to the host that is compatible with common security policies or is built on top of standard cluster management tools

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions