ENH pandoc parser minor improvements and support older pandoc #284

Merged
adrinjalali merged 4 commits into skops-dev:main from BenjaminBossan:ENH-pandoc-parser-minor-improvements on Jan 25, 2023

@BenjaminBossan (Collaborator) commented Jan 25, 2023

Description

Add support for LineBreak item

This one was a bit obscure: it is only triggered when the model card contains a line break with trailing whitespace.
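
For illustration, here is a minimal sketch of how a parser might handle such an item. It assumes the pandoc JSON AST, where inline nodes are dicts with a `"t"` type tag and an optional `"c"` content field, and where a hard line break shows up as `{"t": "LineBreak"}`. The function name `render_inlines` and the choice to emit a plain newline are hypothetical and are not taken from the skops code.

```python
# Hypothetical, simplified renderer for a list of pandoc inline nodes.
# It is NOT the skops implementation, only a sketch of the idea.
def render_inlines(items):
    parts = []
    for item in items:
        kind = item["t"]
        if kind == "Str":
            # Plain text is stored in the "c" (content) field.
            parts.append(item["c"])
        elif kind == "Space":
            parts.append(" ")
        elif kind in ("SoftBreak", "LineBreak"):
            # Assumption: both kinds of break are rendered as a newline.
            parts.append("\n")
        else:
            # Fail loudly on node types this sketch does not know about.
            raise ValueError(f"unsupported inline type: {kind}")
    return "".join(parts)


inlines = [
    {"t": "Str", "c": "first"},
    {"t": "LineBreak"},
    {"t": "Str", "c": "second"},
]
print(render_inlines(inlines))
```

Without a branch for `LineBreak`, a sketch like this would raise on any model card that contains a hard line break, which matches the kind of failure described above.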

Add support for older pandoc versions

This one came up when working on a HF space. There, non-Python dependencies can be installed, but only as debian packages via apt. However, the apt repo only has an old pandoc version, which doesn't work with the current implementation.

The solutions are either to support older pandoc versions or to use a Docker space, which seems like overkill. Fortunately, supporting older pandoc versions wasn't as hard as expected. The main change was introduced with (afaict) pandoc 2.5, which changed handling of tables. Now there are two table implementations, old and new, which differ slightly.

Tested and works with the following pandoc versions:

  • 2.0
  • 2.2
  • 2.5
  • 2.19
  • 3.0

Minimum pandoc version has been decreased to 2.0. CI has been adapted to install pandoc via apt, resulting in version 2.9.2.1.
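
A minimum-version check along these lines could be sketched as follows. It assumes that the first line of `pandoc --version` has the form `pandoc 2.9.2.1`. The names `parse_pandoc_version`, `is_supported` and `MIN_PANDOC_VERSION` are hypothetical and do not come from the skops code base; only the minimum of 2.0 is taken from this PR.

```python
import re

# Minimum version stated in this PR; the constant name is hypothetical.
MIN_PANDOC_VERSION = (2, 0)


def parse_pandoc_version(version_output):
    """Extract a version tuple from the output of `pandoc --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    # Assumption: the first line looks like "pandoc 2.9.2.1".
    match = re.match(r"pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)", first_line)
    if match is None:
        raise ValueError(f"cannot parse pandoc version from: {first_line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_supported(version):
    """Return True if the version tuple meets the minimum requirement."""
    return version >= MIN_PANDOC_VERSION


version = parse_pandoc_version("pandoc 2.9.2.1\nCompiled with pandoc-types ...")
print(version, is_supported(version))
```

In practice the input string would come from running `pandoc --version`, for example via `subprocess.run(["pandoc", "--version"], capture_output=True, text=True)`.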

@BenjaminBossan (Collaborator, Author)

Ready for review @skops-dev/maintainers

@adrinjalali (Member) left a comment

Nice!

  fi
  if [ ${{ matrix.os }} == "ubuntu-latest" ];
- then wget -q https://github.com/jgm/pandoc/releases/download/2.19.2/pandoc-2.19.2-1-amd64.deb && sudo dpkg -i pandoc-2.19.2-1-amd64.deb;
+ then sudo apt install pandoc && pandoc --version;
Member

woohoo :D

Comment thread skops/card/_markup.py Outdated

- def _table_cols(self, items) -> list[str]:
+ def _table_cols_old(self, items) -> list[str]:  # pragma: no cover
+     # pandoc < 2.5
Member

any idea when we can drop support for this?

Collaborator Author

I guess as soon as apt updates to a more recent pandoc version. I have no idea when that will be.

Comment thread skops/card/_markup.py

- columns = self._table_cols(thead_body)
+ columns = self._table_cols_new(thead_body)
Member

I would have expected these lines to be covered.

Collaborator Author

So the table change actually happens after 2.9, not 2.5 as I first suspected. Therefore, given that we now install pandoc via apt on CI, which uses 2.9, the old table implementation is covered, not the new one. I thus moved the `# pragma: no cover` markers to the new implementation.
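
To make the old/new split concrete, here is a rough sketch of a version-based dispatch. The threshold `(2, 10)` is an assumption that is consistent with "after 2.9" but is not stated in this thread. The function `pick_table_cols_impl` is hypothetical; only the method names `_table_cols_old` and `_table_cols_new` appear in the diff, and the actual parser may choose between them differently.

```python
# Assumption: the new table layout starts with pandoc 2.10.
# The thread only says the change happens "after 2.9".
NEW_TABLE_MIN_VERSION = (2, 10)


def pick_table_cols_impl(pandoc_version):
    """Return the name of the table-column handler for a version tuple."""
    if pandoc_version >= NEW_TABLE_MIN_VERSION:
        return "_table_cols_new"
    return "_table_cols_old"


# Versions mentioned in this PR: the apt version on CI and two tested ones.
for version in [(2, 9, 2, 1), (2, 19), (3, 0)]:
    print(version, pick_table_cols_impl(version))
```

Under this assumption, a CI run on the apt version 2.9.2.1 only exercises the old handler, which is why the coverage pragmas end up on the new one.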

@adrinjalali (Member) left a comment

Well, we basically tested the new stuff, and now also the old stuff. This is good to go. Thanks @BenjaminBossan

@adrinjalali changed the title from "ENH pandoc parser minor improvements" to "ENH pandoc parser minor improvements and support older pandoc" on Jan 25, 2023
@adrinjalali merged commit d9b7c36 into skops-dev:main on Jan 25, 2023
@BenjaminBossan deleted the ENH-pandoc-parser-minor-improvements branch on January 25, 2023 15:38