[Documentation] Extend documentation for debugging submission commands #969

Closed
jan-janssen wants to merge 1 commit into main from debug-submission-doc-4419427271094377417

@jan-janssen jan-janssen commented Apr 16, 2026

The documentation in docs/trouble_shooting.md was updated to include a new section titled "Debugging Submission Command". This section provides guidance on how to handle and troubleshoot errors that occur when submitting jobs to a queuing system (like Slurm). Key additions include:

  • Explanation of how submission errors are captured and raised as exceptions (e.g., subprocess.CalledProcessError).
  • Warning about potential hangs in interactive environments like Jupyter notebooks when background submission threads fail.
  • Instructions for manual debugging by inspecting the input HDF5 files (_i.h5) in the cache directory.
  • Guidance on using the error_log_file option in resource_dict to capture detailed stack traces for failed tasks.
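How a failed submission command surfaces as an exception on the future can be sketched with the standard library alone. This is a minimal illustration, not executorlib's implementation: `submit_job` and `failing_command` are hypothetical names, and a short Python one-liner stands in for a queuing-system command that exits with a non-zero status.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def submit_job(command):
    """Run a submission command; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(command, check=True, capture_output=True, text=True).stdout


# Hypothetical stand-in for a submission command that the queuing system rejects.
failing_command = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('invalid partition'); sys.exit(1)",
]

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(submit_job, failing_command)
    try:
        # The exception raised in the worker thread is re-raised by result().
        # The timeout keeps an interactive session from blocking indefinitely.
        future.result(timeout=60)
    except subprocess.CalledProcessError as err:
        returncode = err.returncode
        stderr_text = err.stderr
        print(f"submission failed with exit code {returncode}: {stderr_text}")
```

Calling `result()` with a timeout is one way to avoid waiting forever in a notebook when the background thread has already failed.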

PR created automatically by Jules for task 4419427271094377417 started by @jan-janssen

Summary by CodeRabbit

Documentation

  • Added troubleshooting section explaining how job submission failures are surfaced and providing guidance on debugging via cache inspection and error log files.

Extended docs/trouble_shooting.md to include instructions on how to
debug failed job submissions, especially for HPC Job Executors.
Explains error propagation, manual debugging via the cache directory,
and usage of the error_log_file parameter.

Addresses #959

Co-authored-by: jan-janssen <3854739+jan-janssen@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 16, 2026 07:19

coderabbitai Bot commented Apr 16, 2026

Walkthrough

Adds a new "Debugging Submission Command" section to the troubleshooting guide, documenting how HPC Job Executor submission failures are surfaced through future.result(), debugging via executorlib cache inspection, and error logging using the error_log_file resource setting.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Troubleshooting Documentation: `docs/trouble_shooting.md` | Adds "Debugging Submission Command" section covering HPC Job Executor error handling, executorlib cache inspection techniques, and error_log_file configuration for task execution error logging with full stack traces. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title clearly and directly summarizes the main change: adding documentation for debugging submission commands in the troubleshooting guide. |

Copilot AI left a comment

Pull request overview

Updates the troubleshooting documentation to better support diagnosing failures when submitting jobs via the HPC Job Executor (e.g., Slurm/pysqa), including how errors surface and what artifacts to inspect in the cache.

Changes:

  • Added a “Debugging Submission Command” section describing how submission errors are raised via future.result().
  • Documented debugging via cache inspection and the _i.h5/HDF5 artifacts.
  • Added guidance and an example for using error_log_file to collect stack traces from failed tasks.
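The idea of collecting a failed task's full stack trace in a log file can be illustrated with the standard library. The wrapper below is a hypothetical sketch and not how executorlib implements `error_log_file`; the names `run_with_error_log` and `failing_task` and the file name `error.out` are assumptions made for the example.

```python
import tempfile
import traceback
from pathlib import Path


def run_with_error_log(function, error_log_file, *args, **kwargs):
    """Call function; on failure append the full stack trace to error_log_file and re-raise."""
    try:
        return function(*args, **kwargs)
    except Exception:
        with open(error_log_file, "a") as handle:
            handle.write(traceback.format_exc())
        raise


def failing_task(x):
    # Hypothetical task that always fails.
    return x / 0


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "error.out"
    try:
        run_with_error_log(failing_task, log_path, 1)
    except ZeroDivisionError:
        pass
    # The log now holds the traceback, including the failing function's name.
    log_text = log_path.read_text()
    print(log_text)
```

Because the exception is re-raised after logging, the caller still sees the failure, while the log file keeps the trace for later inspection.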

Comment thread docs/trouble_shooting.md
Comment on lines +96 to +97
input and output for each task as HDF5 files. If a submission fails, you can find the corresponding `_i.h5` file in the
cache directory and manually try to submit the command to get more detailed error messages.
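Locating the input files mentioned in the quoted lines could look like the sketch below. It only assumes that input files end in `_i.h5`, as stated above; the helper `find_input_files`, the task names, and the `_o.h5` suffix used for the non-matching file are hypothetical, and the demo runs against a throwaway directory rather than a real cache.

```python
import tempfile
from pathlib import Path


def find_input_files(cache_directory):
    """Return all files ending in `_i.h5` below the cache directory, sorted by path."""
    return sorted(Path(cache_directory).rglob("*_i.h5"))


# Demonstration on a temporary directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    (cache / "task_a_i.h5").touch()
    (cache / "task_a_o.h5").touch()  # hypothetical non-input file, should be skipped
    (cache / "task_b_i.h5").touch()
    found = [path.name for path in find_input_files(cache)]
    print(found)
```

The returned paths identify which tasks have input files in the cache, which narrows down the submission to retry by hand.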

codecov Bot commented Apr 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.15%. Comparing base (0829868) to head (e36ede5).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #969   +/-   ##
=======================================
  Coverage   94.15%   94.15%           
=======================================
  Files          39       39           
  Lines        2089     2089           
=======================================
  Hits         1967     1967           
  Misses        122      122           

@jan-janssen jan-janssen marked this pull request as draft April 16, 2026 07:28
@jan-janssen jan-janssen changed the title Extend documentation for debugging submission commands [Doumentation] Extend documentation for debugging submission commands Apr 16, 2026
@jan-janssen jan-janssen changed the title [Doumentation] Extend documentation for debugging submission commands [Documentation] Extend documentation for debugging submission commands Apr 16, 2026
@jan-janssen jan-janssen deleted the debug-submission-doc-4419427271094377417 branch April 16, 2026 09:46