We have had great success so far using
- https://github.com/datalab-to/marker
to extract data from PDFs into Markdown components.
It would be interesting to compare it against a couple of newly released tools:
- https://github.com/Yuliang-Liu/MonkeyOCR