Description / Summary
I would like to be able to preprocess my notebooks before they are parsed.
I would like behavior similar to nbconvert preprocessors.
Value / benefit
The main motivation for such an extension is that it would be possible to easily include any notebook transformations and previously written nbconvert preprocessors.
An example of a custom preprocessor:
Instead of having to write cell metadata into the special metadata field, one could simply write a magic comment that would automatically be converted to the metadata:
    # remove-output
    print("hello world")
Could be preprocessed to:
    print("hello world")
with the metadata tag: remove-output.
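To make the idea concrete, here is a minimal sketch of such a magic-comment transformation. It works on plain dicts shaped like notebook cells and assumes `source` is a single string. The function name `magic_comments_to_tags` and the `KNOWN_TAGS` set are hypothetical and chosen only for illustration. In an nbconvert preprocessor, similar logic would presumably live in a `preprocess_cell(cell, resources, index)` method.

```python
import re

# Hypothetical set of recognised magic comments; only "remove-output" appears in this issue.
KNOWN_TAGS = {"remove-output", "remove-input", "remove-cell"}
MAGIC = re.compile(r"^#\s*([\w-]+)\s*$")


def magic_comments_to_tags(cell):
    """Move leading magic comments of a code cell into its metadata tags."""
    if cell.get("cell_type") != "code":
        return cell
    lines = cell["source"].splitlines()
    tags = list(cell.setdefault("metadata", {}).get("tags", []))
    # Consume magic comments from the top of the cell only.
    while lines:
        match = MAGIC.match(lines[0])
        if not match or match.group(1) not in KNOWN_TAGS:
            break
        if match.group(1) not in tags:
            tags.append(match.group(1))
        lines.pop(0)  # drop the comment so it does not show up in the rendered input
    if tags:
        cell["metadata"]["tags"] = tags
    cell["source"] = "\n".join(lines)
    return cell


cell = {
    "cell_type": "code",
    "metadata": {},
    "source": '# remove-output\nprint("hello world")',
}
magic_comments_to_tags(cell)
```

After the call, the cell source no longer contains the comment and the cell carries the `remove-output` tag.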
I know that it would be possible to simply write the metadata directly into the metadata field, but how to do that differs between JupyterLab, VS Code, etc.
Also, it is often not easy to see what metadata has been written to each cell.
A different solution is to use a preprocessing script to modify the notebook before it is parsed, but that requires an extra step outside of the normal pipeline, and some preprocessors are not idempotent (for example, the metadata writer preprocessor removes the magic comment line so that it does not pollute the output).
Maybe this is a bit far-fetched, but I hope that it makes a little bit of sense. 😅
A top-down motivation would be use inside jupyter-book, where it would be nice to use the metadata preprocessor shown above to select which inputs/outputs of a code cell to show, and to have the magic comment removed so that it does not pollute the printed output.
A different, simpler use-case would be to use a RegexRemovePreprocessor that would remove all cells that match the regular expression.
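As a rough stand-in for that use-case, the sketch below filters out cells whose source matches any of a list of regular expressions. It is not nbconvert's implementation. The function name `remove_cells_matching`, the plain-dict cells, and the example patterns are assumptions made for illustration.

```python
import re


def remove_cells_matching(cells, patterns):
    """Return only the cells whose source matches none of the patterns."""
    compiled = [re.compile(p) for p in patterns]
    # re.match anchors at the start of the source, so a pattern describes
    # how a cell has to begin in order to be removed.
    return [
        cell for cell in cells
        if not any(rx.match(cell["source"]) for rx in compiled)
    ]


cells = [
    {"cell_type": "code", "source": "# hide\nimport setup_helpers"},
    {"cell_type": "code", "source": 'print("hello world")'},
    {"cell_type": "markdown", "source": ""},
]
# Drop cells starting with "# hide" and cells that are empty or whitespace-only.
kept = remove_cells_matching(cells, patterns=[r"# hide", r"\s*\Z"])
```

Here only the `print("hello world")` cell survives.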
Implementation details
I believe that this should be done after executing the notebook.
That way there is no difference between the cached view and the original view, a difference that might otherwise introduce issues,
for example when the RegexRemovePreprocessor removes a cell that produces an error.
Specifically, I would execute the preprocessors here.
But then there might be an issue with the source-map/pseudo-line numbering when the preprocessor deletes entire cells.
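To illustrate the concern: once a cell is deleted, every later cell shifts up by one position, so anything keyed on the cell index no longer points at the original cell. One conceivable mitigation, shown here purely as an assumption and not as something the parser currently does, is to keep a map from new positions back to original positions. The helper name `index_map_after_removal` is hypothetical.

```python
def index_map_after_removal(n_cells, removed):
    """Map each surviving cell's new position to its index in the original notebook."""
    removed = set(removed)
    return [i for i in range(n_cells) if i not in removed]


# A four-cell notebook where the preprocessor deleted the cell at index 1.
mapping = index_map_after_removal(4, removed=[1])
# mapping[k] is the original index of the cell that now sits at position k.
```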
PS: Maybe I am too focused/used to NotebookNode and should rather consider working on the SyntaxTree.
Tasks to complete
No response