willow: "yea you're probably right. i skimmed some tesseract/ocrmypdf docs earlier and didn't see anything too hopeful. if i was more ML pilled i would take up the mantle myself. what i'm envisioning rn is it would take an image of a page, recognize the footnotes, and crop them out of the image, and OCR would come in after"


April 7, 2024 at 8:02 PM