Merged
8 changes: 6 additions & 2 deletions ARCHITECTURE.md
@@ -109,7 +109,8 @@ Built on russh and russh-sftp with custom tokio_client wrapper:
- Host key verification (known_hosts support)
- Command execution with streaming output
- SFTP file transfers (upload/download)
-- Connection timeout and keepalive handling
+- Connection timeout handling
+- Configurable SSH keepalive (ServerAliveInterval, ServerAliveCountMax)

### Terminal User Interface (TUI)
**Documentation**: [docs/architecture/tui.md](./docs/architecture/tui.md)
@@ -265,7 +266,10 @@ See [LICENSE](./LICENSE) file for licensing information.
- **Parallelism**: Adjust `--parallel` flag (default: 10)
- **Connection timeout**: Use `--connect-timeout` (default: 30s)
- **Command timeout**: Use `--timeout` (default: 5min)
-- **Keep-alive**: Automatic via russh (every 30s)
+- **Keepalive**: Configurable via `--server-alive-interval` (default: 60s) and `--server-alive-count-max` (default: 3)
+  - Interval of 0 disables keepalive
+  - Connection is considered dead after `interval * (count_max + 1)` seconds without response
+  - Equivalent to OpenSSH `ServerAliveInterval` and `ServerAliveCountMax` options
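
As a worked example of the dead-connection formula quoted above, the following minimal sketch computes the detection window for a given pair of settings. The function name `dead_after_secs` is hypothetical and does not come from the bssh codebase; it only restates `interval * (count_max + 1)` and treats an interval of 0 as "keepalive disabled".

```rust
/// Hypothetical helper (not part of bssh): seconds of silence after which a
/// connection would be considered dead, using the formula from the changed
/// ARCHITECTURE.md text: interval * (count_max + 1).
/// Returns `None` when the interval is 0, because keepalive is then disabled.
fn dead_after_secs(interval_secs: u64, count_max: usize) -> Option<u64> {
    if interval_secs == 0 {
        return None;
    }
    // Saturating arithmetic avoids overflow panics on extreme inputs.
    let attempts = (count_max as u64).saturating_add(1);
    Some(interval_secs.saturating_mul(attempts))
}
```

With the documented defaults (interval 60, count max 3) this evaluates to 240 seconds; with `--server-alive-interval 30 --server-alive-count-max 5` it evaluates to 180 seconds.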

### Configuration Schema

14 changes: 14 additions & 0 deletions README.md
@@ -28,6 +28,7 @@ A high-performance SSH client with **SSH-compatible syntax** for both single-hos
- **Interactive Mode**: Interactive shell sessions with single-node or multiplexed multi-node support
- **SSH Config Caching**: High-performance caching of SSH configurations with TTL and file modification detection
- **Configurable Timeouts**: Set both connection timeout (`--connect-timeout`) and command execution timeout (`--timeout`) with support for unlimited execution
- **SSH Keepalive**: Configurable keepalive settings (`--server-alive-interval`, `--server-alive-count-max`) to prevent idle connection timeouts

## Installation

@@ -233,6 +234,15 @@ bssh -C production --connect-timeout 10 "uptime"
# Different timeouts for connection and command
bssh -C production --connect-timeout 5 --timeout 600 "long-running-job"

# Configure SSH keepalive (prevent idle connection timeouts)
bssh -C production --server-alive-interval 30 "long-running-job"

# Disable keepalive (set interval to 0)
bssh -C production --server-alive-interval 0 "quick-job"

# Keepalive with custom max retries (default: 3)
bssh -C production --server-alive-interval 60 --server-alive-count-max 5 "long-running-job"

# Fail-fast mode: stop immediately on any failure (pdsh -k compatible)
bssh -k -H "web1,web2,web3" "deploy.sh"
bssh --fail-fast -C production "critical-script.sh"
@@ -701,6 +711,8 @@ defaults:
  parallel: 10
  timeout: 300 # Command timeout in seconds (0 for unlimited)
  jump_host: bastion.example.com # Global default jump host (optional)
  server_alive_interval: 60 # SSH keepalive interval in seconds (0 to disable)
  server_alive_count_max: 3 # Max keepalive messages without response

# Global interactive mode settings (optional)
interactive:
@@ -1162,6 +1174,8 @@ Options:
-p, --parallel <PARALLEL> Maximum parallel connections [default: 10]
--timeout <TIMEOUT> Command timeout in seconds (0 for unlimited) [default: 300]
--connect-timeout <SECONDS> SSH connection timeout in seconds (minimum: 1) [default: 30]
--server-alive-interval <SECONDS> SSH keepalive interval in seconds (0 to disable) [default: 60]
--server-alive-count-max <COUNT> Max keepalive messages without response [default: 3]
--output-dir <OUTPUT_DIR> Output directory for command results
-N, --no-prefix Disable hostname prefix in output (pdsh -N compatibility)
-v, --verbose Increase verbosity (-v, -vv, -vvv)
10 changes: 10 additions & 0 deletions docs/architecture/configuration.md
@@ -85,7 +85,10 @@ pub struct Defaults {
    pub port: Option<u16>,
    pub ssh_key: Option<PathBuf>,
    pub parallel: Option<usize>,
    pub timeout: Option<u64>,
    pub jump_host: Option<String>, // Global default jump host
    pub server_alive_interval: Option<u64>, // SSH keepalive interval
    pub server_alive_count_max: Option<usize>, // Max keepalive attempts
}

pub struct Cluster {
@@ -98,7 +101,11 @@ pub struct ClusterDefaults {
    pub user: Option<String>,
    pub port: Option<u16>,
    pub ssh_key: Option<PathBuf>,
    pub parallel: Option<usize>,
    pub timeout: Option<u64>,
    pub jump_host: Option<String>, // Cluster-level jump host
    pub server_alive_interval: Option<u64>, // SSH keepalive interval
    pub server_alive_count_max: Option<usize>, // Max keepalive attempts
}

// Node can be simple string or detailed config
@@ -133,7 +140,10 @@ defaults:
  user: admin
  port: 22
  ssh_key: ~/.ssh/id_rsa
  timeout: 300
  jump_host: global-bastion.example.com # Default for all clusters
  server_alive_interval: 60 # SSH keepalive interval in seconds
  server_alive_count_max: 3 # Max keepalive attempts before disconnect

clusters:
  production:
4 changes: 4 additions & 0 deletions docs/architecture/ssh-client.md
@@ -29,6 +29,10 @@
- Support for SSH agent, key-based, and password authentication
- Configurable timeouts and retry logic
- Full SFTP support for file transfers
- SSH keepalive support via `SshConnectionConfig`:
  - `keepalive_interval`: Interval between keepalive packets (default: 60s, 0 to disable)
  - `keepalive_max`: Maximum unanswered keepalive packets before disconnect (default: 3)
  - Equivalent to OpenSSH `ServerAliveInterval` and `ServerAliveCountMax`
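
The builder-style configuration described in this list can be pictured with a small stand-in type. `KeepaliveSketch` below is hypothetical and is not the crate's `SshConnectionConfig`; only the field names `keepalive_interval` and `keepalive_max`, the `with_*` method names, and the 60s / 3 defaults are taken from this PR. The `interval_duration` helper is an added assumption for illustration.

```rust
use std::time::Duration;

/// Hypothetical stand-in for `SshConnectionConfig` (illustration only).
#[derive(Debug, Clone, PartialEq)]
struct KeepaliveSketch {
    /// Seconds between keepalive packets; `None` means keepalive is disabled.
    keepalive_interval: Option<u64>,
    /// Unanswered keepalive packets tolerated before disconnecting.
    keepalive_max: usize,
}

impl KeepaliveSketch {
    /// Starts from the defaults documented in this PR: 60 seconds, 3 attempts.
    fn new() -> Self {
        Self { keepalive_interval: Some(60), keepalive_max: 3 }
    }

    fn with_keepalive_interval(mut self, interval: Option<u64>) -> Self {
        self.keepalive_interval = interval;
        self
    }

    fn with_keepalive_max(mut self, max: usize) -> Self {
        self.keepalive_max = max;
        self
    }

    /// Assumed convenience: the interval as a `Duration`, if keepalive is on.
    fn interval_duration(&self) -> Option<Duration> {
        self.keepalive_interval.map(Duration::from_secs)
    }
}
```

Representing "disabled" as `None` rather than as a literal 0 keeps the zero sentinel at the CLI and config boundary, which matches how the dispatcher in this PR converts an interval of 0 before building the config.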

**Security Implementation:**
- Host key verification with three modes:
40 changes: 40 additions & 0 deletions src/app/dispatcher.rs
@@ -28,6 +28,7 @@ use bssh::{
    config::InteractiveMode,
    pty::PtyConfig,
    security::get_sudo_password,
    ssh::tokio_client::{SshConnectionConfig, DEFAULT_KEEPALIVE_INTERVAL, DEFAULT_KEEPALIVE_MAX},
};
use std::path::{Path, PathBuf};
use std::sync::Arc;
@@ -433,6 +434,44 @@ async fn handle_exec_command(cli: &Cli, ctx: &AppContext, command: &str) -> Resu
        tracing::info!("Using jump host: {}", jh);
    }

    // Build SSH connection config with precedence: CLI > SSH config > YAML config > defaults
    let keepalive_interval = cli
        .server_alive_interval
        .or_else(|| {
            ctx.ssh_config
                .get_int_option(hostname.as_deref(), "serveraliveinterval")
                .map(|v| v as u64)
        })
        .or_else(|| ctx.config.get_server_alive_interval(effective_cluster_name))
        .unwrap_or(DEFAULT_KEEPALIVE_INTERVAL);

    let keepalive_max = cli
        .server_alive_count_max
        .or_else(|| {
            ctx.ssh_config
                .get_int_option(hostname.as_deref(), "serveralivecountmax")
                .map(|v| v as usize)
        })
        .or_else(|| {
            ctx.config
                .get_server_alive_count_max(effective_cluster_name)
        })
        .unwrap_or(DEFAULT_KEEPALIVE_MAX);

    let ssh_connection_config = SshConnectionConfig::new()
        .with_keepalive_interval(if keepalive_interval == 0 {
            None
        } else {
            Some(keepalive_interval)
        })
        .with_keepalive_max(keepalive_max);

    tracing::debug!(
        "SSH keepalive config: interval={:?}s, max={}",
        ssh_connection_config.keepalive_interval,
        ssh_connection_config.keepalive_max
    );

    let params = ExecuteCommandParams {
        nodes: ctx.nodes.clone(),
        command,
@@ -461,6 +500,7 @@ async fn handle_exec_command(cli: &Cli, ctx: &AppContext, command: &str) -> Resu
        batch: cli.batch,
        fail_fast: cli.fail_fast,
        ssh_config: Some(&ctx.ssh_config),
        ssh_connection_config,
    };
    execute_command(params).await
}
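
The precedence chain in this hunk (CLI > SSH config > YAML config > defaults, with 0 mapped to "disabled") can be isolated as a pure function for reasoning or testing. This is a minimal sketch, not code from the PR: `resolve_keepalive_interval` is a hypothetical name, and the SSH config value is assumed to be an `i64` because the diff does not show the return type of `get_int_option`. Unlike the `as u64` cast in the hunk, the sketch skips a negative SSH config value and falls through to the next layer; that is a deliberate difference for illustration, not a description of the PR's behavior.

```rust
/// Hypothetical helper: resolve the keepalive interval from layered sources.
/// Precedence: CLI flag, then SSH config, then YAML config, then the default.
/// Returns `None` when the resolved value is 0 (keepalive disabled).
fn resolve_keepalive_interval(
    cli: Option<u64>,
    ssh_config: Option<i64>, // assumed signed; a negative value is ignored here
    yaml: Option<u64>,
    default_secs: u64,
) -> Option<u64> {
    let secs = cli
        .or_else(|| ssh_config.and_then(|v| u64::try_from(v).ok()))
        .or(yaml)
        .unwrap_or(default_secs);
    // An interval of 0 means "do not send keepalives".
    (secs != 0).then_some(secs)
}
```

Note that an explicit 0 on a higher-priority layer wins over a non-zero value on a lower one, so `--server-alive-interval 0` disables keepalive even when the SSH config or YAML file sets an interval.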
14 changes: 14 additions & 0 deletions src/cli/bssh.rs
@@ -189,6 +189,20 @@ pub struct Cli {
    )]
    pub connect_timeout: u64,

    #[arg(
        long = "server-alive-interval",
        value_name = "SECONDS",
        help = "Keepalive interval in seconds (default: 60, 0 to disable)\nSends keepalive packets to prevent idle connection timeouts.\nMatches OpenSSH ServerAliveInterval option."
    )]
    pub server_alive_interval: Option<u64>,

    #[arg(
        long = "server-alive-count-max",
        value_name = "COUNT",
        help = "Max keepalive messages without response before disconnect (default: 3)\nConnection is considered dead after this many missed keepalives.\nMatches OpenSSH ServerAliveCountMax option."
    )]
    pub server_alive_count_max: Option<usize>,

    #[arg(
        long,
        help = "Require all nodes to succeed (v1.0-v1.1 behavior)\nDefault: return main rank's exit code (v1.2+)\nUseful for health checks and monitoring where all nodes must be operational"
2 changes: 2 additions & 0 deletions src/cli/pdsh.rs
@@ -325,6 +325,8 @@ impl PdshCli {
            local_forwards: Vec::new(),
            remote_forwards: Vec::new(),
            dynamic_forwards: Vec::new(),
            server_alive_interval: None,
            server_alive_count_max: None,
        }
    }
}
6 changes: 5 additions & 1 deletion src/commands/exec.rs
@@ -21,6 +21,7 @@ use crate::forwarding::ForwardingType;
use crate::node::Node;
use crate::security::SudoPassword;
use crate::ssh::known_hosts::StrictHostKeyChecking;
use crate::ssh::tokio_client::SshConnectionConfig;
use crate::ssh::SshConfig;
use crate::ui::OutputFormatter;
use crate::utils::output::save_outputs_to_files;
@@ -49,6 +50,8 @@ pub struct ExecuteCommandParams<'a> {
    pub batch: bool,
    pub fail_fast: bool,
    pub ssh_config: Option<&'a SshConfig>,
    /// SSH connection configuration (keepalive settings)
    pub ssh_connection_config: SshConnectionConfig,
}

pub async fn execute_command(params: ExecuteCommandParams<'_>) -> Result<()> {
@@ -217,7 +220,8 @@ async fn execute_command_without_forwarding(params: ExecuteCommandParams<'_>) ->
         .with_sudo_password(params.sudo_password)
         .with_batch_mode(params.batch)
         .with_fail_fast(params.fail_fast)
-        .with_ssh_config(params.ssh_config.cloned());
+        .with_ssh_config(params.ssh_config.cloned())
+        .with_ssh_connection_config(params.ssh_connection_config);

     // Set keychain usage if on macOS
     #[cfg(target_os = "macos")]
36 changes: 36 additions & 0 deletions src/config/resolver.rs
Original file line number Diff line number Diff line change
@@ -186,4 +186,40 @@ impl Config {
            .filter(|s| !s.is_empty())
            .map(|s| expand_env_vars(s))
    }

    /// Get SSH keepalive interval for a cluster.
    ///
    /// Resolution priority (highest to lowest):
    /// 1. Cluster-level `server_alive_interval`
    /// 2. Global default `server_alive_interval`
    ///
    /// Returns None if not specified (defaults will be applied at connection time).
    pub fn get_server_alive_interval(&self, cluster_name: Option<&str>) -> Option<u64> {
        if let Some(cluster_name) = cluster_name {
            if let Some(cluster) = self.get_cluster(cluster_name) {
                if let Some(interval) = cluster.defaults.server_alive_interval {
                    return Some(interval);
                }
            }
        }
        self.defaults.server_alive_interval
    }

    /// Get SSH keepalive count max for a cluster.
    ///
    /// Resolution priority (highest to lowest):
    /// 1. Cluster-level `server_alive_count_max`
    /// 2. Global default `server_alive_count_max`
    ///
    /// Returns None if not specified (defaults will be applied at connection time).
    pub fn get_server_alive_count_max(&self, cluster_name: Option<&str>) -> Option<usize> {
        if let Some(cluster_name) = cluster_name {
            if let Some(cluster) = self.get_cluster(cluster_name) {
                if let Some(count) = cluster.defaults.server_alive_count_max {
                    return Some(count);
                }
            }
        }
        self.defaults.server_alive_count_max
    }
}
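
The two resolver methods above share one shape: try the cluster-level value, then fall back to the global default. A compact way to express that lookup with `Option` combinators is sketched below. `SketchConfig` and its fields are hypothetical and much simpler than the real `Config` (clusters are reduced to a map from name to an optional interval); the sketch is meant to show the fallback logic only, not to propose a replacement for the PR's code.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified configuration (not bssh's `Config`).
struct SketchConfig {
    /// Global default `server_alive_interval`, if set.
    default_interval: Option<u64>,
    /// Cluster name mapped to its cluster-level `server_alive_interval`, if set.
    clusters: HashMap<String, Option<u64>>,
}

impl SketchConfig {
    /// Cluster-level value first, then the global default; `None` if neither is set.
    fn server_alive_interval(&self, cluster_name: Option<&str>) -> Option<u64> {
        cluster_name
            .and_then(|name| self.clusters.get(name)) // unknown cluster: fall through
            .and_then(|interval| *interval) // cluster without a value: fall through
            .or(self.default_interval)
    }
}
```

As in the PR, an unknown cluster name or a cluster without its own setting both end up at the global default, and the caller applies the built-in default only when this returns `None`.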
14 changes: 14 additions & 0 deletions src/config/types.rs
@@ -41,6 +41,13 @@ pub struct Defaults {
    /// Jump host specification for all connections.
    /// Empty string explicitly disables jump host inheritance.
    pub jump_host: Option<String>,
    /// SSH keepalive interval in seconds.
    /// Sends keepalive packets to prevent idle connection timeouts.
    /// Default: 60 seconds. Set to 0 to disable.
    pub server_alive_interval: Option<u64>,
    /// Maximum keepalive messages without response before disconnect.
    /// Default: 3
    pub server_alive_count_max: Option<usize>,
}

/// Interactive mode configuration.
@@ -123,6 +130,13 @@ pub struct ClusterDefaults {
    /// Jump host specification for this cluster.
    /// Empty string explicitly disables jump host inheritance.
    pub jump_host: Option<String>,
    /// SSH keepalive interval in seconds.
    /// Sends keepalive packets to prevent idle connection timeouts.
    /// Default: 60 seconds. Set to 0 to disable.
    pub server_alive_interval: Option<u64>,
    /// Maximum keepalive messages without response before disconnect.
    /// Default: 3
    pub server_alive_count_max: Option<usize>,
}

/// Node configuration within a cluster.
6 changes: 6 additions & 0 deletions src/executor/connection_manager.rs
@@ -23,6 +23,7 @@ use crate::security::SudoPassword;
use crate::ssh::{
    client::{CommandResult, ConnectionConfig},
    known_hosts::StrictHostKeyChecking,
    tokio_client::SshConnectionConfig,
    SshClient, SshConfig,
};

@@ -40,6 +41,11 @@ pub(crate) struct ExecutionConfig<'a> {
    pub jump_hosts: Option<&'a str>,
    pub sudo_password: Option<Arc<SudoPassword>>,
    pub ssh_config: Option<&'a SshConfig>,
    /// SSH connection configuration (keepalive settings).
    /// Note: This field is currently passed through the executor for future use.
    /// Keepalive is applied at the Client::connect_with_ssh_config level.
    #[allow(dead_code)]
    pub ssh_connection_config: Option<&'a SshConnectionConfig>,
}

/// Execute a command on a node with jump host support.