Conversation
mschubert
left a comment
There was a problem hiding this comment.
Thanks for the PR! I added some comments, please have a look and explain/amend as appropriate.
| qname = c("SLURM", "LSF", "SGE", "GCS", "OCS", "LOCAL") | ||
| exec = Sys.which(c("sbatch", "bsub", "qsub", "qsub", "qsub")) | ||
| select = c(which(nchar(exec) > 0), 6)[1] |
There was a problem hiding this comment.
This will not work: We're checking available shell commands to guess the scheduler. If qsub is available, this is SGE (by assumption). We can not distinguish between PBS, Torque, GCS, OCS; so it makes no sense to check these here.
There was a problem hiding this comment.
What is your recommendation here?
There was a problem hiding this comment.
I think it's best to leave the code as it was before
| * [GCS](#gcs) - *works without setup* | ||
| * [OCS](#ocs) - *works without setup* |
There was a problem hiding this comment.
This is incorrect, needs clustermq.scheduler to be set (see PBS/Torque)
| * [GCS](https://mschubert.github.io/clustermq/articles/userguide.html#gcs) - *works without setup* | ||
| * [OCS](https://mschubert.github.io/clustermq/articles/userguide.html#ocs) - *works without setup* |
There was a problem hiding this comment.
This is incorrect, needs clustermq.scheduler to be set (see PBS/Torque)
There was a problem hiding this comment.
Ok. Now I understand. From the scheduler end there is no change required. A default OCS/GCS installtion is sufficient.
| the missing options. | ||
|
|
||
|
|
||
| ### GCS |
There was a problem hiding this comment.
The templates and docs of GCS/OCS are a lot more verbose than for the other schedulers. We should try to be consistent here
There was a problem hiding this comment.
Does this mean that I should remove helpfull comments?
| OCS = R6::R6Class("OCS", | ||
| inherit = QSys, | ||
|
|
||
| public = list( |
There was a problem hiding this comment.
Why is the SGE initializer not reused here? Job names are guaranteed to be unique within clustermq; but if IDs are better, we should use them in SGE as well
There was a problem hiding this comment.
I have no access to SGE and do not know if job names were unique back then with the old Sun Microsystems release.
| log_worker=FALSE, log_file=NULL, verbose=TRUE) { | ||
| super$initialize(addr=addr, master=master, template=template) | ||
|
|
||
| # fill the template with options and required fields |
There was a problem hiding this comment.
Please only add comments/whitespace where they add value. If the function is called fill_template, a comment with fill the template is superfluous.
| private$template_error(class(self)[1], status, filled) | ||
|
|
||
| # try to read the job ID from stdout. On error stop | ||
| private$job_id <- regmatches(private$qsub_stdout, regexpr("^[0-9]+", private$qsub_stdout)) |
There was a problem hiding this comment.
Please be consistent with assignments; we use = by convention in this project
| # first call finalize to send qdel ... | ||
| private$finalize() |
There was a problem hiding this comment.
qdel should only be called if there are still running jobs, not otherwise. I believe the cleanup implementation in SGE is correct.
There was a problem hiding this comment.
Ok. This did not work for me when i had the same code for OCS/GCS. qdel was not triggered although the worker tasks where processed and there were still pending jobs...
As discussed in #341, this PR adds native backend support for OCS and GCS.
Summary of changes:
Testing:
Many thanks for the helpful feedback and suggestions.
Best regards,
Ernst