Hello,
I’m using future.apply functions over SSH connections with the cluster plan.
However, the future.apply functions retransmit the data frame or other objects on every mapping call, even when I reuse the same data sets or objects and only change the estimation parameters to find the optimal condition.
Here is a minimal example:
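# Example data set shipped with the mirt package
data <- mirt::Science
# Candidate numbers of factors to compare
nFactors <- 1:4
# Two cluster workers reached over SSH: hosts s1 and s2
future::plan('cluster', workers = paste0('s', 1:2))
# Fit one model per candidate; only the number of factors changes
future.apply::future_lapply(X = nFactors, FUN = function(X, data) {mirt::mirt(data = data, model = X)}, data = data)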
After running this code and watching the network traffic, it appears that the data is retransmitted, even though I have not changed any of the data used for the parameter estimation.
How can I reduce this data retransmission? It makes it hard for me to run HPC workloads on some VPS providers, because they apply QoS limits to every calculation I run.
Best,
Seongho