• hodgepodgin@lemmy.zip
    18 hours ago

    Since there are 78 pages, I’m guessing at least one ambiguity per page? Anyway, it’s dreadfully big.

    • Rioting Pacifist@lemmy.world
      18 hours ago

      2^78 is large, but computers can do an awful lot per second. If only some of the pages contain attachments, you are looking at something like 2^40 to 2^55, and the low end of that range is something you could brute-force in weeks at millions of attempts a second.
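
      For a rough sense of scale, here is a back-of-the-envelope sketch. The rate of 10^6 attempts per second is only an assumption standing in for "millions of attempts a second", and `brute_force_days` is a made-up helper name. It treats every ambiguous character as a two-way choice and computes the worst case of trying all combinations.

```python
SECONDS_PER_DAY = 86_400


def brute_force_days(ambiguous_chars: int, attempts_per_second: float) -> float:
    """Worst-case days needed to try every two-way combination."""
    total_attempts = 2 ** ambiguous_chars
    return total_attempts / attempts_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed rate: one million attempts per second.
    for bits in (40, 55, 78):
        days = brute_force_days(bits, 1e6)
        print(f"2^{bits}: {days:.3g} days")
```

      Each extra ambiguous character doubles the time, so the exponent matters far more than the attempt rate.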

      • vatlark@lemmy.world
        9 hours ago

        I have never looked into the details of OCR, but if it’s a classifier it should give its confidence in each character being a 1 or an L, so you can start with the low-confidence characters.
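
        A minimal sketch of that idea, under some assumptions: the OCR output is a string plus one confidence value per character (a hypothetical format, real engines expose this differently), and the only ambiguity is between `1` and lowercase `l`. The `candidates` function and `SWAP` table are illustrative names. It tries the text as read first, then flips the least confident characters before the more confident ones, which is a heuristic ordering rather than a strict most-likely-first search.

```python
from itertools import combinations
from typing import Iterator, Sequence

# Assumed ambiguity: each "1" might really be "l" and vice versa.
SWAP = {"1": "l", "l": "1"}


def candidates(text: str, confidence: Sequence[float]) -> Iterator[str]:
    """Yield variants of text, flipping low-confidence characters first."""
    # Positions of ambiguous characters, least confident first.
    ambiguous = sorted(
        (i for i, ch in enumerate(text) if ch in SWAP),
        key=lambda i: confidence[i],
    )
    # Try 0 flips, then every single flip, then every pair, and so on.
    for flips in range(len(ambiguous) + 1):
        for combo in combinations(ambiguous, flips):
            chars = list(text)
            for i in combo:
                chars[i] = SWAP[chars[i]]
            yield "".join(chars)


if __name__ == "__main__":
    # Made-up confidences: the "1" at index 1 is the least certain.
    for guess in candidates("a1bl", [0.99, 0.6, 0.99, 0.9]):
        print(guess)
```

        If the right answer only needs a few flips among the doubtful characters, this finds it long before a blind enumeration of all combinations would.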